What Is Rosetta?
Rosetta is a universal multilingual encoding for plain text
with a number of unique features:
- Rosetta is a proper superset of ASCII;
- Any number of languages can be used within a single document;
- Efficiency of encoding is comparable to that of the most compact
  national octet-based codes;
- Generic algorithms for case conversion and sorting are supported:
  the letters of every alphabet are encoded in strictly alphabetical
  order;
- No external information is required to interpret or render
  Rosetta-encoded texts;
- Computing systems or application programs that lack support for a
  particular language can still process (for example, index or sort)
  Rosetta texts containing words or phrases in that language;
- Text can be interpreted starting from any point without having to
  retrieve large numbers of preceding characters;
- Texts can be enhanced with optional reading hints (such as vowels
  in Hebrew, or stress accents in Russian) which do not affect
  processing of the text (i.e. words with hints are identical to
  words without hints for the purpose of machine processing);
- A practically unlimited number of languages and characters is
  supported without reduction in efficiency of encoding;
- A computing system can easily identify and skip or transliterate
  parts of text in languages not supported by that system;
- This encoding was not designed by a committee.
Unlike Unicode, Rosetta allows meaningful text processing by
third-party computing systems (such as Internet search engines)
and produces much smaller files (a Unicode text can be as much as
200% bigger, that is, up to three times the size of the same text
encoded with Rosetta).
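
As a rough illustration of the size claim, the sketch below uses Python
to compare UTF-8 with two national octet-based codes, KOI8-R for Russian
and TIS-620 for Thai. It does not encode anything in Rosetta itself,
whose format is not described here; the octet codes are only a stand-in,
based on the statement above that Rosetta's efficiency is comparable to
theirs. The function name overhead_percent and the sample words are
made up for this example.

```python
def overhead_percent(text: str, unicode_codec: str, octet_codec: str) -> float:
    """Return how much bigger (in percent) the Unicode form is than the octet form."""
    unicode_size = len(text.encode(unicode_codec))
    octet_size = len(text.encode(octet_codec))
    return 100.0 * (unicode_size - octet_size) / octet_size


# Hypothetical sample words; the octet codecs are stand-ins, not Rosetta.
samples = [
    ("Russian", "привет", "koi8_r"),   # Cyrillic letters: 2 bytes each in UTF-8
    ("Thai", "สวัสดี", "tis_620"),      # Thai characters: 3 bytes each in UTF-8
]

for language, word, octet_codec in samples:
    percent = overhead_percent(word, "utf-8", octet_codec)
    print(f"{language}: UTF-8 is {percent:.0f}% bigger than {octet_codec}")
```

With these samples the UTF-8 form is 100% bigger for Russian and 200%
bigger for Thai. Other Unicode encoding forms, such as UTF-16 or UTF-32,
give different figures.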