Mustard alternatives and similar libraries
Based on the "Text" category.
- PhoneNumberKit: A Swift framework for parsing, formatting and validating international phone numbers. Inspired by Google's libphonenumber.
- ZSSRichTextEditor: A beautiful rich text WYSIWYG editor for iOS with a syntax-highlighted source view.
- Twitter Text Obj: Twitter Text Libraries. This code is used at Twitter to tokenize and parse text to meet the expectations for what can be used on the platform.
- FontAwesomeKit: Icon font library for iOS. Currently supports Font-Awesome, Foundation icons, Zocial, and ionicons.
- TwitterTextEditor: A standalone, flexible API that provides a full-featured rich text editor for iOS applications.
- RichEditorView: DISCONTINUED. RichEditorView is a simple, modular, drop-in UIView subclass for Rich Text Editing.
- SwiftyMarkdown: Converts Markdown files and strings into NSAttributedStrings with lots of customisation options.
- Atributika: Convert text with HTML tags, links, hashtags, and mentions into NSAttributedString, and make them clickable with a UILabel drop-in replacement.
- SwiftIconFont: Icon fonts for iOS (Font Awesome 5, Iconic, Ionicon, Octicon, Themify, MapIcon, MaterialIcon, Foundation 3, Elegant Icon, Captain Icon).
- NSStringEmojize: A category on NSString to convert Emoji Cheat Sheet codes to their equivalent Unicode characters.
- Heimdall: A wrapper around the Security framework for simple encryption/decryption operations.
- AttributedTextView: The easiest way to create an attributed UITextView (with support for multiple links and HTML).
README
Mustard 🌭
Mustard is a Swift library for tokenizing strings when splitting by whitespace doesn't cut it.
Quick start using character sets
Foundation includes the String method `components(separatedBy:)`, which allows us to get substrings divided up by certain characters:
let sentence = "hello 2017 year"
let words = sentence.components(separatedBy: .whitespaces)
// words.count -> 3
// words = ["hello", "2017", "year"]
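For contrast, here is a quick Foundation-only sketch of why splitting by whitespace falls short when the text has no separators at all:

```swift
import Foundation

let runOn = "hello2017year"
let pieces = runOn.components(separatedBy: .whitespaces)
// pieces.count -> 1
// pieces = ["hello2017year"] (nothing to split on)
```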
Mustard provides a similar feature, but with the opposite approach: instead of matching by separators, you match by one or more character sets, which is useful when separators simply don't exist:
import Mustard
let sentence = "hello2017year"
let words = sentence.components(matchedWith: .letters, .decimalDigits)
// words.count -> 3
// words = ["hello", "2017", "year"]
If you want more than just the substrings, you can use the `tokens(matchedWith: CharacterSet...)` method, which returns an array of `TokenType`.
As a minimum, `TokenType` requires properties for text (the substring matched) and range (the range of the substring in the original string). When using CharacterSets as a tokenizer, the more specific type `CharacterSetToken` is returned, which includes the property `set` containing the instance of CharacterSet that was used to create the match.
import Mustard
let tokens = "123Hello world&^45.67".tokens(matchedWith: .decimalDigits, .letters)
// tokens: [CharacterSet.Token]
// tokens.count -> 5 (characters '&', '^', and '.' are ignored)
//
// second token..
// tokens[1].text -> "Hello"
// tokens[1].range -> Range<String.Index>(3..<8)
// tokens[1].set -> CharacterSet.letters
//
// last token..
// tokens[4].text -> "67"
// tokens[4].range -> Range<String.Index>(19..<21)
// tokens[4].set -> CharacterSet.decimalDigits
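Because each token carries the set that produced it, you can partition the results by type. The following is a small sketch that uses only the `text` and `set` properties shown above (and assumes standard `CharacterSet` equality):

```swift
import Mustard

let tokens = "123Hello world&^45.67".tokens(matchedWith: .decimalDigits, .letters)

// Split the matched substrings by which character set produced them.
let numbers = tokens.filter { $0.set == .decimalDigits }.map { $0.text }
let words = tokens.filter { $0.set == .letters }.map { $0.text }
// numbers -> ["123", "45", "67"]
// words -> ["Hello", "world"]
```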
Advanced matching with custom tokenizers
Mustard can do more than match from character sets. You can create your own tokenizers with more sophisticated matching behavior by implementing the `TokenizerType` and `TokenType` protocols.
Here's an example of using `DateTokenizer` ([see example for implementation](Documentation/Template%20tokenizer.md)), which finds substrings that match a MM/dd/yy format.
`DateTokenizer` returns tokens with the type `DateToken`. Along with the substring text and range, `DateToken` includes a `Date` object corresponding to the date in the substring:
import Mustard
let text = "Serial: #YF 1942-b 12/01/17 (Scanned) 12/03/17 (Arrived) ref: 99/99/99"
let tokens = text.tokens(matchedWith: DateTokenizer())
// tokens: [DateTokenizer.Token]
// tokens.count -> 2
// ('99/99/99' is *not* matched by `DateTokenizer` because it's not a valid date)
//
// first date
// tokens[0].text -> "12/01/17"
// tokens[0].date -> Date(2017-12-01 05:00:00 +0000)
//
// last date
// tokens[1].text -> "12/03/17"
// tokens[1].date -> Date(2017-12-03 05:00:00 +0000)
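The date validation itself can be done with a plain `DateFormatter`. The snippet below is only a sketch of the kind of check the linked `DateTokenizer` implementation presumably performs, not the library's actual code:

```swift
import Foundation

// Non-lenient parsing (the default) rejects impossible dates,
// which is why "99/99/99" is not matched above.
let formatter = DateFormatter()
formatter.locale = Locale(identifier: "en_US_POSIX")
formatter.dateFormat = "MM/dd/yy"

formatter.date(from: "12/01/17") // -> a valid Date
formatter.date(from: "99/99/99") // -> nil
```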
Documentation & Examples
- [Greedy tokens and tokenizer order](Documentation/Greedy%20tokens%20and%20tokenizer%20order.md)
- [Token types and AnyToken](Documentation/Token%20types%20and%20AnyToken.md)
- [TokenizerType: implementing your own tokenizer](Documentation/TokenizerType%20protocol.md)
- [EmojiTokenizer: matching emoji substrings](Documentation/Matching%20emoji.md)
- [LiteralTokenizer: matching specific substrings](Documentation/Literal%20tokenizer.md)
- [DateTokenizer: tokenizer based on template match](Documentation/Template%20tokenizer.md)
- [Alternatives to using Mustard](Documentation/Alternatives%20to%20using%20Mustard.md)
- [Performance comparisons](Documentation/Performance%20Comparisons.md)
Roadmap
- [x] Include detailed examples and documentation
- [x] Ability to skip/ignore characters within match
- [x] Include more advanced pattern matching for matching tokens
- [x] Make project logo 🌭
- [x] Performance testing / benchmarking against Scanner
- [ ] Include interface for working with Character tokenizers
Requirements
- Swift 4.1
Author
Made with :heart: by @permakittens
Contributing
Feedback and contributions for bug fixes or improvements are welcome. Feel free to submit a pull request or open an issue.
License
MIT
*Note that all licence references and agreements mentioned in the Mustard README section above are relevant to that project's source code only.*