Unicode Plus

Unicode Plus is a set of Unicode & emoji utilities wrapped into one single app, built with Electron.

This app works on Mac OS X, Linux and Windows operating systems.

Utilities

The following utilities are currently available:

Emoji Data Finder

Find by Name

  • The Find by Name feature of the Emoji Data Finder utility displays a list of basic data (symbol, short name, keywords, code) of matching Unicode emoji searched by name or keyword, including through regular expressions.
  • After entering a query, clicking the Search button will display a list of all relevant matches, if any.
  • Fully-qualified (keyboard/palette) emoji are presented in a standard way, while non-fully-qualified (display/process) emoji are shown in a distinctive muted (grayed out) style.
  • This feature deals with the 3570 emoji defined in the Emoji 11.0 version of the emoji-test.txt data file; the 12 keycap bases and the 26 singleton Regional Indicator characters are not included.
  • Various examples of regular expressions are provided for quick copy-and-paste.
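As a rough illustration of a name search, the sketch below parses a few lines written in the emoji-test.txt layout and filters them with a regular expression. The sample lines, `parseLine` and `findByName` are hypothetical names chosen for this example; this is not the app's actual code.

```javascript
// Hypothetical excerpt in the emoji-test.txt line layout (Emoji 11.0 style).
const sampleLines = [
    "1F600 ; fully-qualified     # \u{1F600} grinning face",
    "2764 FE0F ; fully-qualified     # \u2764\uFE0F red heart",
    "2764 ; non-fully-qualified # \u2764 red heart",
    "1F431 ; fully-qualified     # \u{1F431} cat face"
];

// Parse one data line into { symbol, codes, status, name }, or null if it does not fit the layout.
function parseLine(line) {
    const match = line.match(/^([0-9A-F ]+?)\s*;\s*([a-z-]+)\s*#\s*\S+\s+(.+)$/);
    if (!match) return null;
    const codes = match[1].trim().split(/\s+/);
    const symbol = String.fromCodePoint(...codes.map((hex) => parseInt(hex, 16)));
    return { symbol, codes, status: match[2], name: match[3] };
}

// Keep only the entries whose short name matches the given regular expression.
function findByName(lines, regex) {
    return lines.map(parseLine).filter((entry) => entry !== null && regex.test(entry.name));
}

const hearts = findByName(sampleLines, /\bheart\b/i);
for (const entry of hearts) {
    console.log(entry.symbol, entry.name, entry.codes.join(" "), entry.status);
}
```

Both encodings of "red heart" are returned; the `status` field is what would let a user interface mute the non-fully-qualified one.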

Emoji Data Finder - Find by Name screenshot

Match Symbol

  • The Match Symbol feature of the Emoji Data Finder utility displays a list of basic data (symbol, short name, keywords, code) of Unicode emoji matching a symbol, or a regular expression using Unicode properties.
  • After entering a query, clicking the Search button will display a list of all relevant matches, if any.
  • Fully-qualified (keyboard/palette) emoji are presented in a standard way, while non-fully-qualified (display/process) emoji are shown in a distinctive muted (grayed out) style.
  • This feature deals with the 3570 emoji defined in the Emoji 11.0 version of the emoji-test.txt data file; the 12 keycap bases and the 26 singleton Regional Indicator characters are not included.
  • Various examples of regular expressions are provided for quick copy-and-paste.
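A query built on Unicode properties might look like the following sketch, which keeps only the emoji made of a modifier base followed by a skin tone modifier. The `candidates` list is a made-up sample, and the property names are the standard ones available to JavaScript regular expressions with the 'u' flag.

```javascript
// Hypothetical sample of emoji to test against the query.
const candidates = [
    "\u{1F600}",            // grinning face
    "\u{1F44D}",            // thumbs up
    "\u{1F44D}\u{1F3FD}",   // thumbs up: medium skin tone
    "\u270B\u{1F3FB}",      // raised hand: light skin tone
    "\u{1F431}"             // cat face
];

// A modifier base immediately followed by a skin tone modifier, and nothing else.
const withSkinTone = /^\p{Emoji_Modifier_Base}\p{Emoji_Modifier}$/u;

const skinToneMatches = candidates.filter((emoji) => withSkinTone.test(emoji));
console.log(skinToneMatches.join(" "));
```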

Emoji Data Finder - Match Symbol screenshot

Filter Text

  • The Filter Text feature of the Emoji Data Finder utility displays in real time a list of basic data (symbol, short name, keywords, code) of all the Unicode emoji contained in a text string.
  • Text can be typed directly or pasted from the clipboard into the main input field. Clicking on the Filter button strips out all non-emoji characters.
  • It is also possible to input predefined sets of emoji selected from the Samples ▾ pop-up menu.
  • As a convenience, the input field can be emptied using the Clear button.
  • Fully-qualified (keyboard/palette) emoji are presented in a standard way, while non-fully-qualified (display/process) emoji are shown in a distinctive muted (grayed out) style.
  • This feature deals with the 3570 emoji defined in the Emoji 11.0 version of the emoji-test.txt data file; the 12 keycap bases and the 26 singleton Regional Indicator characters are not included.
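The idea of stripping non-emoji characters can be approximated with a single property-based regular expression. This is a deliberately simplified sketch: the hypothetical `keepEmoji` helper only keeps code points that have the Emoji_Presentation property, so it ignores keycaps, text-presentation emoji and multi-code-point sequences, which the app itself handles through the emoji-test.txt data.

```javascript
// Keep only the code points that default to emoji presentation (simplified assumption).
function keepEmoji(text) {
    const found = text.match(/\p{Emoji_Presentation}/gu);
    return found ? found.join("") : "";
}

const filtered = keepEmoji("Hello \u{1F600}, meet my cat \u{1F431}!");
console.log(filtered);
```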

Emoji Data Finder - Filter Text screenshot

Emoji Picture Book

  • The Emoji Picture Book utility displays lists of Unicode emoji in a picture book fashion.
  • Any group of pictures can be displayed by selecting its name in the category pop-up menu, among:
    "Smileys & People", "Animals & Nature", "Food & Drink", "Travel & Places", "Activities", "Objects", "Symbols", "Flags".
  • The size of all emoji pictures (from 32 to 128 pixels) can be adjusted by moving the dedicated slider left and right.
  • The groups and subgroups of emoji are those defined in the Emoji 11.0 version of the emoji-test.txt data file; the 12 keycap bases and the 26 singleton Regional Indicator characters are not included.
  • Only the 2789 fully-qualified (keyboard/palette) encodings of the emoji are used unless they cannot be displayed properly, depending on the emoji support level of the operating system.
  • Emoji that cannot be rendered as proper pictures are simply discarded.

Emoji Picture Book screenshot

Emoji References

  • The Emoji References utility provides a list of reference links to emoji-related web pages.

Emoji References screenshot

JavaScript Runner

  • The JavaScript Runner utility lets you execute JavaScript code, and comes with several sample scripts related to Unicode and emoji; it is useful for quick testing/prototyping or data processing.
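The snippet below gives an idea of the kind of short script that could be tried in such a runner. It is an illustrative example written for this document, not one of the bundled samples; it compares UTF-16 code units with code points and shows the effect of normalization.

```javascript
// A character outside the Basic Multilingual Plane takes two UTF-16 code units.
const heart = "\u{1F49C}";
const codeUnitCount = heart.length;
const codePointCount = Array.from(heart).length;

// Canonical decomposition splits a precomposed letter into base letter + combining mark.
const composed = "\u00E9";
const decomposed = composed.normalize("NFD");

console.log(codeUnitCount, codePointCount);
console.log(composed.length, decomposed.length);
```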

JavaScript Runner screenshot

Regex Properties

  • The Regex Properties utility displays all the Unicode properties available in this app for regular expressions, used in particular by the Emoji Data Finder and Unicode Data Finder utilities.

  • These properties are suitable for building Unicode-aware regular expressions in JavaScript using the 'u' flag (introduced in ECMAScript 6); the \p{…} property escapes themselves were standardized in ECMAScript 2018.

  • Unicode properties fall into four groups, which can be displayed individually using the Category pop-up menu:

    • General Category properties
    • Binary properties
    • Script properties
    • Script Extensions properties
  • For General Category properties, prefixing with General_Category= (Canonical) or gc= (Alias) is optional. Use the Optional Prefix checkbox to control whether the prefix is included or not.

  • Groupings:

    Property      Description
    Cased_Letter  Uppercase_Letter | Lowercase_Letter | Titlecase_Letter
    Letter        Uppercase_Letter | Lowercase_Letter | Titlecase_Letter | Modifier_Letter | Other_Letter
    Mark          Nonspacing_Mark | Spacing_Mark | Enclosing_Mark
    Number        Decimal_Number | Letter_Number | Other_Number
    Punctuation   Connector_Punctuation | Dash_Punctuation | Open_Punctuation | Close_Punctuation | Initial_Punctuation | Final_Punctuation | Other_Punctuation
    Symbol        Math_Symbol | Currency_Symbol | Modifier_Symbol | Other_Symbol
    Separator     Space_Separator | Line_Separator | Paragraph_Separator
    Other         Control | Format | Surrogate | Private_Use | Unassigned
  • \P{…} is the negated form of \p{…}. Use the Negated checkbox to toggle between the two forms.

  • Notes:

    • \p{Any} is equivalent to [\u{0}-\u{10FFFF}]
    • \p{ASCII} is equivalent to [\u{0}-\u{7F}]
    • \p{Assigned} is equivalent to \P{Unassigned}
  • Information pertaining to this list has been gathered from several sources (see References), and slightly refined through trial and error.
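The forms described above can be tried directly in any JavaScript engine that supports Unicode property escapes. The sketch below is only a sampler; the variable names are arbitrary, and the last pair assumes the usual Unicode data for U+30FC, whose script is Common but whose script extensions include Hiragana.

```javascript
// General Category: the prefix is optional, and aliases are accepted.
const upperForms = [
    /^\p{Uppercase_Letter}$/u,
    /^\p{Lu}$/u,
    /^\p{General_Category=Uppercase_Letter}$/u,
    /^\p{gc=Lu}$/u
];
const allMatchA = upperForms.every((regex) => regex.test("A"));

// Negated form, grouping, binary property, and the special properties from the notes.
const notUpper = /^\P{Lu}$/u.test("a");
const isLetter = /^\p{Letter}$/u.test("\u611B");
const isAlphabetic = /^\p{Alphabetic}$/u.test("\u03A9");
const isAscii = /^\p{ASCII}$/u.test("\u00E9");
const isAssigned = /^\p{Assigned}$/u.test("A");

// Script versus Script_Extensions for U+30FC (prolonged sound mark).
const scriptHiragana = /^\p{Script=Hiragana}$/u.test("\u30FC");
const extensionsHiragana = /^\p{Script_Extensions=Hiragana}$/u.test("\u30FC");

console.log(allMatchA, notUpper, isLetter, isAlphabetic, isAscii, isAssigned);
console.log(scriptHiragana, extensionsHiragana);
```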

Regex Properties screenshot

Unicode Data Finder

Find by Name

  • The Find by Name feature of the Unicode Data Finder utility displays a list of basic data (symbol, code point, name, block) of matching Unicode characters searched by name (or alias name), including through regular expressions.
  • After entering a query, clicking the Search button will display a list of all relevant matches, if any, ordered by code point value.
  • When available, name aliases are also displayed (in a smaller typeface) after the unique and immutable Unicode name. A correction alias is indicated by a leading reference mark (※).
  • It is possible to choose how many characters are shown one page at a time.
  • The search is performed on the 276955 assigned characters (or code points) defined in the Unicode 11.0 version of the UnicodeData.txt data file.
  • Various examples of regular expressions are provided for quick copy-and-paste.

Unicode Data Finder - Find by Name screenshot

Match Symbol

  • The Match Symbol feature of the Unicode Data Finder utility displays a list of basic data (symbol, code point, name, block) of Unicode characters matching a symbol, or a regular expression using Unicode properties.
  • After entering a query, clicking the Search button will display a list of all relevant matches, if any, ordered by code point value.
  • It is possible to choose how many characters are shown one page at a time.
  • The search is performed on the 276955 assigned characters (or code points) defined in the Unicode 11.0 version of the UnicodeData.txt data file.
  • Various examples of regular expressions are provided for quick copy-and-paste.
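One simple way to picture a property-based search is a linear scan over a range of code points, which naturally yields results ordered by code point value. The `matchInRange` helper below is a hypothetical sketch and says nothing about how the app actually indexes UnicodeData.txt.

```javascript
// Return every code point in [first, last] whose character matches the regex.
function matchInRange(regex, first, last) {
    const results = [];
    for (let codePoint = first; codePoint <= last; codePoint++) {
        if (regex.test(String.fromCodePoint(codePoint))) {
            results.push(codePoint);
        }
    }
    return results;
}

// Currency symbols within Basic Latin and Latin-1 Supplement.
const currency = matchInRange(/^\p{Currency_Symbol}$/u, 0x0000, 0x00FF);
console.log(currency.map((cp) => cp.toString(16).toUpperCase()).join(" "));
```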

Unicode Data Finder - Match Symbol screenshot

List by Block

  • The List by Block feature of the Unicode Data Finder utility displays in real time a list of basic data (symbol, code point, name, block) of Unicode characters belonging to the same block range.
  • It is possible to choose how many characters are shown one page at a time.
  • A block can be selected either by Block Range or by Block Name, as defined in the Unicode 11.0 version of the Blocks.txt data file.
  • It is also possible to directly enter a code point (or character) in the Specimen field, then click on the Go button to automatically select the block containing the code point, scroll its basic data into view, and highlight its hexadecimal code value.
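Looking up the block that contains a given specimen boils down to a range test. The sketch below uses a handful of real ranges from Blocks.txt, but the `blocks` array and the `findBlock` helper are assumptions made for illustration; note that this simplified version treats any 4 to 6 hex digit string as a code, so a literal word such as "CAFE" would be read as U+CAFE.

```javascript
// A small hand-picked subset of Blocks.txt ranges.
const blocks = [
    { first: 0x0000, last: 0x007F, name: "Basic Latin" },
    { first: 0x0080, last: 0x00FF, name: "Latin-1 Supplement" },
    { first: 0x0370, last: 0x03FF, name: "Greek and Coptic" },
    { first: 0x4E00, last: 0x9FFF, name: "CJK Unified Ideographs" },
    { first: 0x1F600, last: 0x1F64F, name: "Emoticons" }
];

// Accept either a hexadecimal code ("U+611B", "611B") or a literal character.
function findBlock(specimen) {
    const hex = specimen.match(/^(?:U\+)?([0-9A-F]{4,6})$/i);
    const codePoint = hex ? parseInt(hex[1], 16) : specimen.codePointAt(0);
    return blocks.find((block) => codePoint >= block.first && codePoint <= block.last) || null;
}

console.log(findBlock("U+611B").name);
console.log(findBlock("\u{1F600}").name);
```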

Unicode Data Finder - List by Block screenshot

Unicode Inspector

  • The Unicode Inspector utility displays code point information in real time for each Unicode character of a text string.
  • Characters can be entered either directly in the "Characters" input field, or using a series of code points in hexadecimal format in the "Code Points" input field.
  • It is also possible to input predefined sets of characters selected from each Samples ▾ pop-up menu.
  • As a convenience, each input field can be emptied using the Clear button.
  • In output, the standard Unicode code point format U+0041 is used, i.e. "U+" directly followed by 4 to 6 hex digits.
  • In input, more hexadecimal formats are allowed, including Unicode escape sequences, such as \u611B and \u{1F49C}. Clicking on the Filter button converts all valid codes to standard Unicode code point format.
  • Information is provided for the 276955 assigned characters (or code points) defined in the Unicode 11.0 version of the UnicodeData.txt data file.
  • Extra information is also obtained from the following data files:

Unicode Inspector - Characters screenshot

Unicode Inspector - Code Points screenshot
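The two directions handled by the Unicode Inspector (characters to code points, and assorted hexadecimal notations to the standard format) can be sketched as follows. The helper names are hypothetical, and only three input notations are recognized here (\uXXXX, \u{X…} and U+XXXX); the app may well accept more.

```javascript
// Format a numeric code point as "U+" followed by at least 4 uppercase hex digits.
function formatCodePoint(codePoint) {
    return "U+" + codePoint.toString(16).toUpperCase().padStart(4, "0");
}

// Characters to code points: iterate by code point, not by UTF-16 code unit.
function inspectCharacters(text) {
    return Array.from(text, (character) => formatCodePoint(character.codePointAt(0)));
}

// Assorted notations to the standard format; anything unrecognized is skipped.
function toStandardCodePoints(input) {
    const pattern = /\\u\{([0-9a-f]{1,6})\}|\\u([0-9a-f]{4})|U\+([0-9a-f]{4,6})/gi;
    const result = [];
    let match;
    while ((match = pattern.exec(input)) !== null) {
        const value = parseInt(match[1] || match[2] || match[3], 16);
        if (value <= 0x10FFFF) {
            result.push(formatCodePoint(value));
        }
    }
    return result.join(" ");
}

console.log(inspectCharacters("A\u{1F49C}").join(" "));
console.log(toStandardCodePoints("\\u611B \\u{1F49C} U+0041"));
```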

Unicode References

  • The Unicode References utility provides a list of reference links to Unicode-related web pages.

Unicode References screenshot

Unihan References

  • The Unihan References utility provides a list of reference links to Unihan-related web pages.

Unihan References screenshot

Building

You'll need Node.js installed on your computer in order to build this app.

git clone https://github.com/tonton-pixel/unicode-plus
cd unicode-plus
npm install
npm start

If you don't wish to clone, you can download the source code.

Several scripts are also defined in the package.json file to build OS-specific bundles of the app, using the simple yet powerful Electron Packager Node module.
For instance, running the following command will create a Unicode Plus.app version for Mac OS X:

npm run build-darwin

Using

You can download the latest release for Mac OS X.

License

The MIT License (MIT).

Copyright © 2018 Michel MARIANI.
