I want peg.js to treat '<' and '>' differently depending on the parsing context: as tag delimiters in some places and as ordinary text in others.

I wrote the following parser in peg.js:

per.pegjs

Start
= c:(Content+) EOL {
  return c;
}

Content
= openId:OpenTag c:(Content)+ closeId:CloseTag {
    // Reject mismatched tags such as <1>...</2>
    if (openId !== closeId) {
        throw new Error("expect </" + openId + "> but </" + closeId + ">");
    }
    return { type: 'element', id: openId, content: c };
}
/ txt:ContentText {
  return { type: 'txt', content: txt.trim() };
}


ContentText = [^<>\n]+ { return text(); }

OpenTag = "<" id:[0-9]+ ">" { return parseInt(id.join(''), 10); }
CloseTag = "</" id:[0-9]+ ">" { return parseInt(id.join(''), 10); }


EOL = [\n]*

It receives the following input and outputs JSON:

<1>abc</1><2>def<3>ghi</3></2>

output

[
   {
      "type": "element",
      "id": 1,
      "content": [
         {
            "type": "txt",
            "content": "abc"
         }
      ]
   },
   {
      "type": "element",
      "id": 2,
      "content": [
         {
            "type": "txt",
            "content": "def"
         },
         {
            "type": "element",
            "id": 3,
            "content": [
               {
                  "type": "txt",
                  "content": "ghi"
               }
            ]
         }
      ]
   }
]

This parser does not allow '<' or '>' inside ContentText.
If I want to allow them there, how should I write the parser?
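
The limitation comes from the ContentText character class itself. As a rough illustration outside peg.js, the same class can be tried as a plain JavaScript regular expression (the helper name matchContentText is hypothetical and not part of the grammar):

```javascript
// Same character class as ContentText, written as a JavaScript RegExp.
const contentText = /^[^<>\n]+/;

// Hypothetical helper: returns the text the class would consume, or null.
function matchContentText(input) {
  const m = contentText.exec(input);
  return m ? m[0] : null;
}

console.log(matchContentText('abc</1>'));   // consumes the text before the tag
console.log(matchContentText('a < b</1>')); // stops at the first '<'
```

In the second call the match ends just before the '<'. In the grammar, neither OpenTag nor CloseTag matches at that position either, so the whole parse fails.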

javascript

2022-09-29 22:54

1 Answer

(Posting self-answer from questioner as wiki)

I solved it myself, so here is a summary of the corrections.
Please let me know if there is a smarter way.

Start = c:(Content+) EOL {
    return c;
}

Content = open:OpenTag c:Content+ close:CloseTag {
   return { type: 'element', id: open, content: c };
}
/
txt:Text {
  return { type: 'txt', content: txt.trim() };
}

Text = txt:(NotOpenTag / NotCloseTag / NotTagNotEOL)+ { return txt.join('').trim(); }

NotTagNotEOL = [^<>\n] { return text(); }

NotOpenTag = "<" !Digit !"/" { return text(); }
/ !Digit ">" { return text(); }

NotCloseTag = "</" !Digit { return text(); }

OpenTag = "<" id:Digit ">" { return id; }
CloseTag = "</" id:Digit ">" { return id; }

Digit = [0-9]+ { return parseInt(text(), 10); }

EOL = [\n]*
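
For comparison, the same disambiguation idea can be sketched as a hand-written JavaScript parser: a '<' or '>' counts as text unless it is part of <digits> or </digits>. This is only a sketch under assumptions, not peg.js output. The names parse and parseContents are hypothetical, and it differs from the grammar above in details: it restores the open/close id check from the question, drops whitespace-only text, allows empty elements, and keeps newlines inside text.

```javascript
// Hand-written sketch (assumption: not generated by peg.js).
const OPEN_TAG = /^<([0-9]+)>/;
const CLOSE_TAG = /^<\/([0-9]+)>/;

// Parses sibling nodes starting at pos. openId is the id of the enclosing
// element, or null at the top level.
function parseContents(input, pos, openId) {
  const nodes = [];
  let text = '';
  const flush = () => {
    if (text.trim() !== '') nodes.push({ type: 'txt', content: text.trim() });
    text = '';
  };
  while (pos < input.length) {
    const rest = input.slice(pos);
    const close = CLOSE_TAG.exec(rest);
    if (close) {
      const closeId = parseInt(close[1], 10);
      if (openId === null) throw new Error('unexpected </' + closeId + '>');
      if (closeId !== openId) {
        throw new Error('expected </' + openId + '> but found </' + closeId + '>');
      }
      flush();
      return { nodes, pos: pos + close[0].length, closed: true };
    }
    const open = OPEN_TAG.exec(rest);
    if (open) {
      flush();
      const id = parseInt(open[1], 10);
      const inner = parseContents(input, pos + open[0].length, id);
      if (!inner.closed) throw new Error('missing </' + id + '>');
      nodes.push({ type: 'element', id, content: inner.nodes });
      pos = inner.pos;
      continue;
    }
    // Not a tag: keep the character as text, including a stray '<' or '>'.
    text += input[pos];
    pos += 1;
  }
  flush();
  return { nodes, pos, closed: false };
}

// Entry point: strips trailing newlines (like EOL) and returns the node list.
function parse(input) {
  return parseContents(input.replace(/\n+$/, ''), 0, null).nodes;
}

console.log(JSON.stringify(parse('<1>a < b > c</1>'), null, 2));
```

Checking for a complete tag first and falling back to plain text plays the same role as the NotOpenTag and NotCloseTag lookaheads in the grammar.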

2022-09-29 22:54
