Kanna: An XML/HTML parser for Swift

Kanna(鉋)

Kanna(鉋) is an XML/HTML parser for cross-platform(macOS, iOS, tvOS, watchOS and Linux!).

It was inspired by Nokogiri(鋸).






:information_source: Documentation

Features

  • [x] XPath 1.0 support for document searching
  • [x] CSS3 selector support for document searching
  • [x] Support for namespaces
  • [x] Comprehensive test suite

Installation for Swift 5

CocoaPods

Add the following to your Podfile:

use_frameworks!
pod 'Kanna', '~> 5.2.2'

Carthage

Add the following to your Cartfile:

github "tid-kijyun/Kanna" ~> 5.2.2

For xcode 11.3 and earlier, the following settings are required.

  1. In the project settings add $(SDKROOT)/usr/include/libxml2 to the "header search paths" field

Swift Package Manager

  1. Installing libxml2 to your computer:
// macOS: For xcode 11.3 and earlier, the following settings are required.
$ brew install libxml2
$ brew link --force libxml2

// Linux(Ubuntu):
$ sudo apt-get install libxml2-dev
  1. Add the following to your Package.swift:
// swift-tools-version:5.0
import PackageDescription

let package = Package(
    name: "YourProject",
    dependencies: [
        .package(url: "https://github.com/tid-kijyun/Kanna.git", from: "5.2.2"),
    ],
    targets: [
        .target(
            name: "YourTarget",
            dependencies: ["Kanna"]),
    ]
)
$ swift build

Note: When a build error occurs, please try run the following command:

// Linux(Ubuntu)
$ sudo apt-get install pkg-config

Manual Installation

  1. Add these files to your project:
    Kanna.swift
    CSS.swift
    libxmlHTMLDocument.swift
    libxmlHTMLNode.swift
    libxmlParserOption.swift
    Modules
  2. In the target settings add $(SDKROOT)/usr/include/libxml2 to the Search Paths > Header Search Paths field
  3. In the target settings add $(SRCROOT)/Modules to the Swift Compiler - Search Paths > Import Paths field

Installation for swift 4

Installation for swift 3

Synopsis

import Kanna

let html = "<html>...</html>"

if let doc = try? HTML(html: html, encoding: .utf8) {
    print(doc.title)
    
    // Search for nodes by CSS
    for link in doc.css("a, link") {
        print(link.text)
        print(link["href"])
    }
    
    // Search for nodes by XPath
    for link in doc.xpath("//a | //link") {
        print(link.text)
        print(link["href"])
    }
}
let xml = "..."
if let doc = try? Kanna.XML(xml: xml, encoding: .utf8) {
    let namespaces = [
                    "o":  "urn:schemas-microsoft-com:office:office",
                    "ss": "urn:schemas-microsoft-com:office:spreadsheet"
                ]
    if let author = doc.at_xpath("//o:Author", namespaces: namespaces) {
        print(author.text)
    }
}

Donation

If you like Kanna, please donate via GitHub sponsors or PayPal.
It is used to improve and maintain the library.

License

The MIT License. See the LICENSE file for more information.

GitHub

https://github.com/tid-kijyun/Kanna