===========================
KANJIDIC2 Database Schema
===========================

* character  # what does the * after a list of child elements mean? (a, b, c, ...)*

CREATE TABLE header (file_version TEXT, database_version TEXT, date_of_creation TEXT);

character: -> CharacterTable
id INTEGER, literal TEXT, grade INTEGER, freq INTEGER, jlpt INTEGER

codepoint, radical, variant: -> TypeValueTable
id INTEGER, fk INTEGER, type TEXT, value TEXT

stroke_count: -> StrokeCountTable
id INTEGER, fk INTEGER, count INTEGER

rad_name, nanori: -> KeyValueTable
id INTEGER, fk INTEGER, value TEXT

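The element-to-layout mapping above can be sketched with Python's sqlite3 module. This is only a minimal sketch: it assumes that each element gets its own table named after the element, and that fk holds the id of the owning character row. Neither is stated in these notes, and the sample values are illustrative.

```python
import sqlite3

# Column layouts copied from the notes above.
LAYOUTS = {
    "CharacterTable": "id INTEGER, literal TEXT, grade INTEGER, freq INTEGER, jlpt INTEGER",
    "TypeValueTable": "id INTEGER, fk INTEGER, type TEXT, value TEXT",
    "StrokeCountTable": "id INTEGER, fk INTEGER, count INTEGER",
    "KeyValueTable": "id INTEGER, fk INTEGER, value TEXT",
}

# Assumption: each element gets its own table, named after the element.
ELEMENT_LAYOUT = {
    "character": "CharacterTable",
    "codepoint": "TypeValueTable",
    "radical": "TypeValueTable",
    "variant": "TypeValueTable",
    "stroke_count": "StrokeCountTable",
    "rad_name": "KeyValueTable",
    "nanori": "KeyValueTable",
}


def create_tables(conn):
    """Create one table per element using its shared column layout."""
    for element, layout in ELEMENT_LAYOUT.items():
        conn.execute(f"CREATE TABLE {element} ({LAYOUTS[layout]})")


conn = sqlite3.connect(":memory:")
create_tables(conn)

# Assumption: fk holds the id of the owning character row.
# The sample values are illustrative only.
conn.execute(
    "INSERT INTO character (id, literal, grade, freq, jlpt) "
    "VALUES (1, '亜', 8, 1509, 1)"
)
conn.execute("INSERT INTO codepoint (id, fk, type, value) VALUES (1, 1, 'ucs', '4e9c')")
conn.execute("INSERT INTO stroke_count (id, fk, count) VALUES (1, 1, 7)")

# Join the child tables back to the character through fk.
row = conn.execute(
    "SELECT c.literal, cp.value, sc.count "
    "FROM character AS c "
    "JOIN codepoint AS cp ON cp.fk = c.id "
    "JOIN stroke_count AS sc ON sc.fk = c.id "
    "WHERE cp.type = 'ucs'"
).fetchone()
print(row)
```

Keeping the layouts in one dictionary means that the three TypeValueTable elements and the two KeyValueTable elements cannot drift apart.
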
dic_number: (First revision)
id INTEGER, fk INTEGER, type TEXT, m_vol TEXT, m_page TEXT
- Might move the moro pieces (m_vol, m_page) into a separate table to save space...
- Alternatively, could merge m_vol/m_page into the moro code directly

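The first option above (a separate table for the moro pieces) might look like the following sketch. The table name dic_number_moro, the helper add_dic_number, and the value column are assumptions made for illustration. The first-revision column list does not include value, and the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Assumption: value holds the dictionary index number.
    CREATE TABLE dic_number (id INTEGER PRIMARY KEY, fk INTEGER, type TEXT, value TEXT);
    -- Hypothetical side table: only moro entries get a row here.
    CREATE TABLE dic_number_moro (fk INTEGER, m_vol TEXT, m_page TEXT);
    """
)


def add_dic_number(conn, char_id, dr_type, value, m_vol=None, m_page=None):
    """Insert a dictionary reference; store m_vol/m_page only for moro rows."""
    cur = conn.execute(
        "INSERT INTO dic_number (fk, type, value) VALUES (?, ?, ?)",
        (char_id, dr_type, value),
    )
    if dr_type == "moro" and m_vol is not None:
        conn.execute(
            "INSERT INTO dic_number_moro (fk, m_vol, m_page) VALUES (?, ?, ?)",
            (cur.lastrowid, m_vol, m_page),
        )
    return cur.lastrowid


add_dic_number(conn, 1, "nelson_c", "43")
add_dic_number(conn, 1, "moro", "272", m_vol="1", m_page="0525")

# A LEFT JOIN restores the original wide row shape when it is needed.
rows = conn.execute(
    "SELECT d.type, d.value, m.m_vol, m.m_page "
    "FROM dic_number AS d "
    "LEFT JOIN dic_number_moro AS m ON m.fk = d.id "
    "ORDER BY d.id"
).fetchall()
print(rows)
```

With this split, rows of other types no longer carry two empty columns, at the cost of a join whenever the volume and page are wanted.
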
id INTEGER, fk INTEGER, type TEXT, skip_misclass TEXT, value TEXT
- Might move skip miscodes into a separate table...
- Might re-code SKIP values separately based upon integers... but
  the same could be argued about all other codes... probably
  shouldn't waste time on this.

id INTEGER, fk INTEGER

id INTEGER, fk INTEGER, type TEXT, on_type TEXT, r_status TEXT, value TEXT

id INTEGER, fk INTEGER, lang TEXT, value TEXT

- Initial tests suggest that indices are unnecessary for KANJIDIC2
  reading/meaning/nanori searches. I do not know if this holds on
  slower systems, but for the time being the default "%xxx%" pattern
- KANJIDIC2 uses special notation for kunyomi readings.
  Specifically, - is used to note prefixes/suffixes, and . is used to
  separate the okurigana.
  It would likely be beneficial to make a special index table for
  looking up kunyomi readings quickly. The prefix/suffix marker
  doesn't cause results to be dropped, but the okurigana marker does,
  and the user shouldn't need to supply (or know about) such details.
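
The index-table idea above could be sketched as follows. The kunyomi_index table, its columns, and the helper names are hypothetical. The reading table uses the column list given earlier in these notes, and the sample readings are illustrative.

```python
import sqlite3


def normalize_kunyomi(reading):
    """Drop the prefix/suffix marker (-) and the okurigana marker (.)."""
    return reading.replace("-", "").replace(".", "")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reading (id INTEGER PRIMARY KEY, fk INTEGER, type TEXT, "
    "on_type TEXT, r_status TEXT, value TEXT)"
)
# Hypothetical lookup table: one marker-free copy per kunyomi reading.
conn.execute("CREATE TABLE kunyomi_index (reading_id INTEGER, plain TEXT)")
conn.execute("CREATE INDEX kunyomi_index_plain ON kunyomi_index (plain)")

samples = [(1, "ja_kun", "た.べる"), (1, "ja_kun", "く.う"), (2, "ja_kun", "おん-")]
for char_id, r_type, value in samples:
    cur = conn.execute(
        "INSERT INTO reading (fk, type, value) VALUES (?, ?, ?)",
        (char_id, r_type, value),
    )
    conn.execute(
        "INSERT INTO kunyomi_index (reading_id, plain) VALUES (?, ?)",
        (cur.lastrowid, normalize_kunyomi(value)),
    )


def search_raw(term):
    """Substring search against the stored notation ("%xxx%" pattern)."""
    rows = conn.execute(
        "SELECT value FROM reading WHERE value LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


def search_indexed(term):
    """Exact search against the marker-free copy, returning the stored form."""
    rows = conn.execute(
        "SELECT r.value FROM kunyomi_index AS k "
        "JOIN reading AS r ON r.id = k.reading_id "
        "WHERE k.plain = ? ORDER BY r.id",
        (normalize_kunyomi(term),),
    )
    return [row[0] for row in rows]


print(search_raw("おん"))         # the prefix/suffix marker does not block a match
print(search_raw("たべる"))       # the okurigana marker does
print(search_indexed("たべる"))   # found through the normalized copy
```

The indexed lookup uses an exact match so that the index on plain can serve the query. Because the search term is normalized as well, a user who does type the markers still gets the same result.
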