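Now that the built-in parser, rewritten for EBNF parsing, has been ported to the spreadsheet, the next step is to write the grammar of spreadsheet formulas in an EBNF-like notation: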
(* Cell data *)
cell ::= expr_cell | value_cell ;
(* Values *)
value_cell ::= time_stamp | date | time | logical_expr | number | boolean | text_etc ;
time_stamp ::= date ' ' time ;
date ::= yyyy '/' MM '/' dd | MM '/' dd ;
yyyy ::= /\d{4}/ ;
MM ::= /\d{1,2}/ ;
dd ::= /\d{1,2}/ ;
time ::= HH ':' mm ':' ss | HH ':' mm ;
HH ::= /\d{1,2}/ ;
mm ::= /\d{1,2}/ ;
ss ::= /\d{1,2}/ ;
number ::= /[-]?([0-9]+)([.][0-9]*)?/ ;
boolean ::= /true/i | /false/i ;
text_etc ::= /.+/ ;
(* Formulas *)
expr_cell ::= '=' expr ;
expr ::= logical_expr ;
logical_expr ::= add_sub_expr [ ( '=' | '>' | '<' | '>=' | '<=' | '<>' ) add_sub_expr ] ;
add_sub_expr ::= mul_div_expr { ( '+' | '-' | '&' ) mul_div_expr }* ;
mul_div_expr ::= factor { ( '*' | '/' ) factor } ;
factor ::= value | parenthesis_expr ;
parenthesis_expr ::= '(' expr ')' ;
value ::= number | boolean | function_def | a1_range | a1 | dqstring ;
a1_range ::= a1 ':' a1 ;
a1 ::= /([A-Za-z]+[1-9][0-9]*)/ ;
dqstring ::= /".*"/ ;
function_def ::= symbol '(' parameters ')' ;
symbol ::= /([A-Za-z][A-Za-z0-9_]*)/ ;
parameters ::= expr { ',' expr } ;
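As a quick sanity check of the terminal rules above, the regular expressions can be tried on their own. The following is a minimal Python sketch, not part of the spreadsheet implementation: the table `TERMINALS` and the helper `full_match` are names made up for this example, and the patterns are copied from the grammar with the `/.../` delimiters stripped.

```python
import re

# Terminal patterns copied from the grammar (delimiters stripped).
# TERMINALS and full_match are hypothetical names used only in this sketch.
TERMINALS = {
    "number": r"[-]?([0-9]+)([.][0-9]*)?",
    "a1": r"([A-Za-z]+[1-9][0-9]*)",
    "symbol": r"([A-Za-z][A-Za-z0-9_]*)",
    "yyyy": r"\d{4}",
}


def full_match(rule: str, text: str) -> bool:
    """Return True when the whole text matches the named terminal."""
    return re.fullmatch(TERMINALS[rule], text) is not None


print(full_match("number", "-12.5"))  # True
print(full_match("number", ".5"))     # False: at least one leading digit is required
print(full_match("a1", "AB12"))       # True
print(full_match("a1", "A0"))         # False: the row number cannot start with 0
print(full_match("symbol", "SUM"))    # True
print(full_match("a1", "SUM"))        # False: a cell reference needs a row number
```

Checks like these make it easier to see which inputs each terminal accepts before the rules are combined.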
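Here the body of an option ([ ... ]) is treated as a sequence. I first changed the EBNF parser's rule for the option body from choice to list, intending to accept either form. However, for input such as "( ... ) value", the ( ... ) part was accepted early as if it were a complete choice, so the parser went straight to checking for the closing ] instead of the following value, and failed. I therefore changed the rule to sequence. Note: I also considered changing the EBNF list rule to sequence | choice ;, but I suspect choice would then never match at all.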
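The failure described above can be reproduced with a toy model. The sketch below is in Python and is only an illustration under assumptions: it works on a pre-split token list, the rule functions (`item`, `sequence`, `choice`, `first_of`, `option`) are hypothetical, and it assumes an ordered choice that commits to the first rule that succeeds without backtracking into it later. The real EBNF parser may differ in all of these respects.

```python
from typing import Callable, List, Optional

Tokens = List[str]
# A rule takes (tokens, position) and returns the new position, or None on failure.
Rule = Callable[[Tokens, int], Optional[int]]


def item(toks: Tokens, pos: int) -> Optional[int]:
    """Match one token that is not a bracket or a '|'."""
    if pos < len(toks) and toks[pos] not in ("[", "]", "|"):
        return pos + 1
    return None


def sequence(toks: Tokens, pos: int) -> Optional[int]:
    """sequence ::= item { item }"""
    end = item(toks, pos)
    if end is None:
        return None
    while (nxt := item(toks, end)) is not None:
        end = nxt
    return end


def choice(toks: Tokens, pos: int) -> Optional[int]:
    """choice ::= item { '|' item }  (a single item also counts as a choice)"""
    end = item(toks, pos)
    if end is None:
        return None
    while end < len(toks) and toks[end] == "|":
        nxt = item(toks, end + 1)
        if nxt is None:
            return None
        end = nxt
    return end


def first_of(*rules: Rule) -> Rule:
    """Ordered choice: commit to the first rule that succeeds."""
    def rule(toks: Tokens, pos: int) -> Optional[int]:
        for r in rules:
            end = r(toks, pos)
            if end is not None:
                return end
        return None
    return rule


def option(body: Rule) -> Rule:
    """option ::= '[' body ']'"""
    def rule(toks: Tokens, pos: int) -> Optional[int]:
        if pos >= len(toks) or toks[pos] != "[":
            return None
        end = body(toks, pos + 1)
        if end is None or end >= len(toks) or toks[end] != "]":
            return None
        return end + 1
    return rule


def accepts(rule: Rule, toks: Tokens) -> bool:
    return rule(toks, 0) == len(toks)


group_then_value = ["[", "(...)", "add_sub_expr", "]"]
alternatives = ["[", "a", "|", "b", "]"]

print(accepts(option(first_of(choice, sequence)), group_then_value))  # False
print(accepts(option(sequence), group_then_value))                    # True
print(accepts(option(first_of(sequence, choice)), alternatives))      # False
```

With `choice` tried first, the lone `(...)` already counts as a complete choice, so `]` is expected where `add_sub_expr` appears. With `sequence` tried first, `a | b` stops at `a` and never reaches the `|`, which matches the worry that choice would never get through.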
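Thinking about it more carefully, a grammar definition statement is fundamentally a sequence, and the notation has no syntax for marking a sequence explicitly. Separating the items with "," would make that possible, but I suspect it would cause problems of its own.
Parsing the grammar above yields the following structure: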
{
"cell": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "expr_cell" }
]
},
{
"_sequence": [
{ "_identifier": "value_cell" }
]
}]
}}},
{
"value_cell": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "time_stamp" }
]
},
{
"_sequence": [
{ "_identifier": "date" }
]
},
{
"_sequence": [
{ "_identifier": "time" }
]
},
{
"_sequence": [
{ "_identifier": "logical_expr" }
]
},
{
"_sequence": [
{ "_identifier": "number" }
]
},
{
"_sequence": [
{ "_identifier": "boolean" }
]
},
{
"_sequence": [
{ "_identifier": "text_etc" }
]
}]
}}},
{
"time_stamp": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "date" },
{ "_sq_string": " " },
{ "_identifier": "time" }
]
}]
}}},
{
"date": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "yyyy" },
{ "_sq_string": "/" },
{ "_identifier": "MM" },
{ "_sq_string": "/" },
{ "_identifier": "dd" }
]
},
{
"_sequence": [
{ "_identifier": "MM" },
{ "_sq_string": "/" },
{ "_identifier": "dd" }
]
}]
}}},
{
"yyyy": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{4}/" }
]
}]
}}},
{
"MM": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{1,2}/" }
]
}]
}}},
{
"dd": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{1,2}/" }
]
}]
}}},
{
"time": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "HH" },
{ "_sq_string": ":" },
{ "_identifier": "mm" },
{ "_sq_string": ":" },
{ "_identifier": "ss" }
]
},
{
"_sequence": [
{ "_identifier": "HH" },
{ "_sq_string": ":" },
{ "_identifier": "mm" }
]
}]
}}},
{
"HH": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{1,2}/" }
]
}]
}}},
{
"mm": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{1,2}/" }
]
}]
}}},
{
"ss": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\\d{1,2}/" }
]
}]
}}},
{
"number": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/[-]?([0-9]+)([.][0-9]*)?/" }
]
}]
}}},
{
"boolean": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/true/i" }
]
},
{
"_sequence": [
{ "_regexp": "/false/i" }
]
}]
}}},
{
"text_etc": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/.+/" }
]
}]
}}},
{
"expr_cell": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_sq_string": "=" },
{ "_identifier": "expr" }
]
}]
}}},
{
"expr": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "logical_expr" }
]
}]
}}},
{
"logical_expr": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "add_sub_expr" },
{
"_option": [
{
"_sequence": [
{
"_group": [
{
"_sequence": [
{ "_sq_string": "=" }
]
},
{
"_sequence": [
{ "_sq_string": ">" }
]
},
{
"_sequence": [
{ "_sq_string": "<" }
]
},
{
"_sequence": [
{ "_sq_string": ">=" }
]
},
{
"_sequence": [
{ "_sq_string": "<=" }
]
},
{
"_sequence": [
{ "_sq_string": "<>" }
]
}]
},
{ "_identifier": "add_sub_expr" }
]
}]
}]
}]
}}},
{
"add_sub_expr": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "mul_div_expr" },
{
"_repeate0": [
{
"_sequence": [
{
"_group": [
{
"_sequence": [
{ "_sq_string": "+" }
]
},
{
"_sequence": [
{ "_sq_string": "-" }
]
},
{
"_sequence": [
{ "_sq_string": "&" }
]
}]
},
{ "_identifier": "mul_div_expr" }
]
}]
}]
}]
}}},
{
"mul_div_expr": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "factor" },
{
"_repeate0": [
{
"_sequence": [
{
"_group": [
{
"_sequence": [
{ "_sq_string": "*" }
]
},
{
"_sequence": [
{ "_sq_string": "/" }
]
}]
},
{ "_identifier": "factor" }
]
}]
}]
}]
}}},
{
"factor": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "value" }
]
},
{
"_sequence": [
{ "_identifier": "parenthesis_expr" }
]
}]
}}},
{
"parenthesis_expr": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_sq_string": "(" },
{ "_identifier": "expr" },
{ "_sq_string": ")" }
]
}]
}}},
{
"value": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "number" }
]
},
{
"_sequence": [
{ "_identifier": "boolean" }
]
},
{
"_sequence": [
{ "_identifier": "function_def" }
]
},
{
"_sequence": [
{ "_identifier": "a1_range" }
]
},
{
"_sequence": [
{ "_identifier": "a1" }
]
},
{
"_sequence": [
{ "_identifier": "dqstring" }
]
}]
}}},
{
"a1_range": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "a1" },
{ "_sq_string": ":" },
{ "_identifier": "a1" }
]
}]
}}},
{
"a1": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/([A-Za-z]+[1-9][0-9]*)/" }
]
}]
}}},
{
"dqstring": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/\".*\"/" }
]
}]
}}},
{
"function_def": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "symbol" },
{ "_sq_string": "(" },
{ "_identifier": "parameters" },
{ "_sq_string": ")" }
]
}]
}}},
{
"symbol": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_regexp": "/([A-Za-z][A-Za-z0-9_]*)/" }
]
}]
}}},
{
"parameters": {
"_definition": {
"_choice": [
{
"_sequence": [
{ "_identifier": "expr" },
{
"_repeate0": [
{
"_sequence": [
{ "_sq_string": "," },
{ "_identifier": "expr" }
]
}]
}]
}]
}}}
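To get a feel for how a dump in this shape could be consumed, here is a minimal Python sketch of an interpreter that walks the nodes directly. It is not the implementation running in the spreadsheet. The function names (`match`, `first_alt`, `skip_ws`, `accepts`) and the small constructor helpers are hypothetical, the grammar is a hand-trimmed subset in which `expr` points straight at `add_sub_expr`, and two behaviors are assumptions: alternatives are tried in order with the first success winning, and whitespace is skipped before every terminal. Regexp option letters such as `i` are ignored here.

```python
import re
from typing import Optional

# Small helpers that build nodes in the same shape as the dump (hypothetical names).
def seq(*items): return {"_sequence": list(items)}
def ident(name): return {"_identifier": name}
def sq(text): return {"_sq_string": text}
def define(*alts): return {"_definition": {"_choice": list(alts)}}

# Hand-trimmed subset of the grammar; expr skips logical_expr to stay short.
GRAMMAR = {
    "expr": define(seq(ident("add_sub_expr"))),
    "add_sub_expr": define(seq(
        ident("mul_div_expr"),
        {"_repeate0": [seq({"_group": [seq(sq("+")), seq(sq("-"))]},
                           ident("mul_div_expr"))]})),
    "mul_div_expr": define(seq(
        ident("factor"),
        {"_repeate0": [seq({"_group": [seq(sq("*")), seq(sq("/"))]},
                           ident("factor"))]})),
    "factor": define(seq(ident("number")), seq(ident("parenthesis_expr"))),
    "parenthesis_expr": define(seq(sq("("), ident("expr"), sq(")"))),
    "number": define(seq({"_regexp": "/[-]?([0-9]+)([.][0-9]*)?/"})),
}


def skip_ws(text: str, pos: int) -> int:
    """Assumption: whitespace between tokens is ignored."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def first_alt(alts, text: str, pos: int) -> Optional[int]:
    """Ordered choice: return the end of the first alternative that matches."""
    for alt in alts:
        end = match(alt, text, pos)
        if end is not None:
            return end
    return None


def match(node, text: str, pos: int) -> Optional[int]:
    """Return the position after the match, or None on failure."""
    (kind, body), = node.items()  # every node has exactly one key
    if kind == "_identifier":
        return match(GRAMMAR[body]["_definition"], text, pos)
    if kind in ("_choice", "_group"):
        return first_alt(body, text, pos)
    if kind == "_sequence":
        for child in body:
            pos = match(child, text, pos)
            if pos is None:
                return None
        return pos
    if kind == "_option":
        end = first_alt(body, text, pos)
        return pos if end is None else end
    if kind == "_repeate0":
        # Repeat while the body matches and consumes input.
        while (end := first_alt(body, text, pos)) is not None and end > pos:
            pos = end
        return pos
    pos = skip_ws(text, pos)
    if kind == "_sq_string":
        return pos + len(body) if text.startswith(body, pos) else None
    if kind == "_regexp":
        # Strip the '/.../' delimiters; option letters are ignored in this sketch.
        m = re.compile(body[1:body.rindex("/")]).match(text, pos)
        return m.end() if m else None
    raise ValueError(f"unknown node kind: {kind}")


def accepts(rule: str, text: str) -> bool:
    end = match(ident(rule), text, 0)
    return end is not None and skip_ws(text, end) == len(text)


print(accepts("expr", "1 + 2 * (3 - 4)"))  # True
print(accepts("expr", "1 +"))              # False
```

Because every node is an object with a single key, dispatching on that key is enough to cover the whole tree.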
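Things seem to be drifting toward giving up on a parser-combinator implementation. But now that my illusion that parser combinators are easier to debug has evaporated, this approach may be good enough.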
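What bothers me more is that the regular-expression options are missing from the description, but since the formula grammar basically ignores whitespace and newlines, it should not be a problem.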
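If the option letters have to be recovered from a `_regexp` string such as `/true/i`, one possible approach is to split the string at its last `/`. The Python sketch below is an assumption-laden illustration: `compile_literal`, `match_token`, and the `FLAGS` table are hypothetical names, only three option letters are mapped, and skipping whitespace before each token is assumed from the remark above.

```python
import re
from typing import Optional

# JavaScript-style option letters mapped to Python flags (only a few are handled).
FLAGS = {"i": re.IGNORECASE, "m": re.MULTILINE, "s": re.DOTALL}


def compile_literal(literal: str) -> re.Pattern:
    """Compile a '/pattern/options' string such as '/true/i'."""
    last = literal.rindex("/")
    if not literal.startswith("/") or last == 0:
        raise ValueError(f"not a regexp literal: {literal!r}")
    pattern, letters = literal[1:last], literal[last + 1:]
    flags = 0
    for letter in letters:
        flags |= FLAGS[letter]  # raises KeyError for letters not in the table
    return re.compile(pattern, flags)


def match_token(literal: str, text: str, pos: int) -> Optional[int]:
    """Skip whitespace, then match the literal at pos; return the end or None."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    m = compile_literal(literal).match(text, pos)
    return m.end() if m else None


print(compile_literal("/true/i").fullmatch("TRUE") is not None)  # True
print(compile_literal("/true/").fullmatch("TRUE") is not None)   # False
print(match_token("/false/i", "  False", 0))                     # 7
```

Keeping the option letters inside the string means the dump format does not need a separate field for them.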
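I am starting to want a thorough set of test patterns, but the internal implementation is still likely to change, so I am giving up on that for now.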