Maximal Starting Repertoire
To enable Generation Panels to start their work, the Integration Panel has created the maximal set of code points for the root zone, called the Maximal Starting Repertoire (MSR), under the Procedure to Develop and Maintain Label Generation Rules (LGR) for the Root Zone in Respect to IDN Labels [PDF, 772 KB]. The Integration Panel may update the MSR based on feedback from the community and to accommodate relevant updates in the Unicode standard. To provide feedback on the latest version of the MSR, send email to IDNProgram@icann.org.
MSR-5 is the current version of the MSR, covering 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan and Thai. MSR-5 contains 33,515 code points short-listed from Unicode version 11.0.
- MSR-5-Overview and Rationale [PDF, 1.1 MB]
- MSR-5 (HTML [13.6 MB], XML [861 KB])
- MSR-5-Annotated-Hangul-Tables [2 MB]
- MSR-5-Annotated-Han-Tables [41.8 MB]
- MSR-5-Annotated-non-CJK-Tables [3.3 MB]
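For readers who want to check a repertoire size against the XML file, the following is a minimal sketch, assuming the MSR XML follows the LGR format of RFC 7940 (namespace urn:ietf:params:xml:ns:lgr-1.0, with char and range elements under data). The function name count_code_points and the SAMPLE fragment are hypothetical and for illustration only; the sample is not taken from any MSR release.

```python
import xml.etree.ElementTree as ET

# Namespace defined by RFC 7940; that the MSR XML uses it is an assumption.
LGR_NS = "urn:ietf:params:xml:ns:lgr-1.0"


def count_code_points(xml_text: str) -> int:
    """Count the single code points listed in the <data> section of an LGR file."""
    root = ET.fromstring(xml_text)
    data = root.find(f"{{{LGR_NS}}}data")
    if data is None:
        return 0
    total = 0
    for element in data:
        if element.tag == f"{{{LGR_NS}}}char":
            # A cp attribute with several values is a sequence; it is skipped here.
            if len(element.get("cp", "").split()) == 1:
                total += 1
        elif element.tag == f"{{{LGR_NS}}}range":
            # Ranges are inclusive and given in hexadecimal.
            first = int(element.get("first-cp"), 16)
            last = int(element.get("last-cp"), 16)
            total += last - first + 1
    return total


# Hypothetical fragment, not real MSR data.
SAMPLE = """<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061"/>
    <char cp="0062"/>
    <range first-cp="0E01" last-cp="0E03"/>
  </data>
</lgr>"""

if __name__ == "__main__":
    print(count_code_points(SAMPLE))
```

To apply it to a downloaded file, read the file contents into a string and pass them to count_code_points. Whether the result matches the published figure depends on how the release counts sequences and ranges, which this sketch does not attempt to reproduce.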
Earlier Versions of Maximal Starting Repertoire
MSR-4 was released on 7 February 2019, covering the following 28 scripts, already included in MSR-3: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan and Thai. MSR-4 contains 33,511 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
- MSR-4-Overview and Rationale [PDF, 800 KB]
- MSR-4 (HTML [13.2 MB], XML [840 KB])
- MSR-4-Annotated-Hangul-Tables [1.09 MB]
- MSR-4-Annotated-Han-Tables [39.1 MB]
- MSR-4-Annotated-non-CJK-Tables [2.16 MB]
MSR-3 was released on 29 March 2018, covering the following 28 scripts, already included in MSR-2: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan and Thai. MSR-3 contains 33,496 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
- MSR-3-Overview and Rationale [PDF, 1.1 MB]
- MSR-3 (HTML [12.5 MB], XML [838 KB])
- MSR-3-Annotated-Hangul-Tables [PDF, 1.09 MB]
- MSR-3-Annotated-Han-Tables [PDF, 39 MB]
- MSR-3-Annotated-non-CJK-Tables [PDF, 2.15 MB]
MSR-2 was released on 27 April 2015, covering the following 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Tibetan and Thai. MSR-2 contains 33,490 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
The MSR-2 release consists of the following documents:
- Maximal Starting Repertoire - MSR-2-Overview and Rationale-20150414 [PDF, 727 KB]
- MSR-2-Annotated-Han-Tables-20150413 [PDF, 43.3 MB]
- MSR-2-Annotated-Hangul-Tables-20150413 [PDF, 4.18 MB]
- MSR-2-Annotated-non-CJK-Tables-20150413 [PDF, 2.41 MB]
- MSR-2-Repertoire+WLE-Rules-20150413 [XML, 745 KB]
MSR-1 was released on 20 June 2014, covering the following 22 scripts: Arabic, Bengali, Cyrillic, Devanagari, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Lao, Latin, Malayalam, Oriya, Sinhala, Tamil, Telugu, and Thai. MSR-1 contains 32,790 code points short-listed from 97,973 PVALID/CONTEXT code points of Unicode version 6.3.
The MSR-1 release consists of the following documents:
- Maximal Starting Repertoire - MSR-1-Overview and Rationale-20140606 [PDF, 477 KB]
- MSR-1-Annotated-Han-Tables-20140606 [PDF, 43.3 MB]
- MSR-1-Annotated-Hangul-Tables-20140606 [PDF, 4.19 MB]
- MSR-1-Annotated-non-CJK-Tables-20140606 [PDF, 1.86 MB]
- MSR-1-Repertoire+WLE-Rules-20140606 [XML, 741 KB]