-
November 13th, 2021, 21:14 #1
Looking for programming advice from experienced programmers
I am looking for advice about writing code. The task is to write a parser for a typical NPC entry, such as the mock-up provided below. The parser reads in the data from a file, parses the lines, extracts the relevant information, and then saves the information in a data structure.
The task becomes interesting after the stats. The following lines about resistances and immunities may or may not be present, and they may be split over multiple lines. I handled this with a huge mess of nested if statements, and a state machine.
So for example, after parsing the Damage Resistances...
1. Set state to "Damage Immunities"
2. read in next line
3. parse line as follows
if first two words are "Damage Immunities" then parse damage immunities
if first two words are "Condition Immunities" then parse condition immunities and set state to "Condition Immunities"
if first word is "Senses" then parse senses and set state to "Languages"
if none of the above then check the state to determine what type of immunity, and parse immunity, then change state to next type of immunity
etc.
I have not even considered the cases that are missing from this NPC entry, such as Damage Vulnerabilities.
This approach gets the job done but it is ugly. Is there a more elegant way?
Thanks.
Code:
Scary Thing
Medium undead, chaotic evil
Armor Class 120
Hit Points 220
Speed 0 ft., fly 50 ft. (hover)
STR 1 (-5) DEX 14 (+2) CON 11 (+0) INT 10 (+0) WIS 10 (+0) CHA 11 (+0)
Damage Resistances Acid, Cold
Damage Immunities Necrotic, Poison
Condition Immunities Charmed
Senses Darkvision 60 ft.
Languages Understands -
Challenge 25
Incorporeal Movement. The scary thing can move through objects.
Actions
Scary Touch. Melee Spell Attack: +4 to hit, reach 5 ft., one creature. Hit: 10 (3d6) necrotic damage.
-
November 13th, 2021, 21:26 #2
After writing that I put some thought into it.
I need to be able to determine two things.
1. Is the current line of data a continuation of the previous line?
2. Which function do I use to parse the line of data?
If I could do this, then I could write a program that reads in lines of data, puts together split lines, determines which function to parse the lines with, and then creates a list of unsplit lines and their parsing functions. This approach is much more generic so it should look better after written.
The only fancy thing I need is first class functions. It seems that the language I use can do that. Yeah!
Any other advice from a pro?
Last edited by spoofer; November 13th, 2021 at 21:28.
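The two questions in this post (is the line a continuation, and which function parses it) can be sketched in Python. This is only a rough sketch under assumptions: the `PARSERS` table, `parse_list`, `parse_text`, `find_header`, and `join_and_dispatch` are made-up names, and a line counts as a continuation simply because it does not start with a known header.

```python
# Hypothetical header table: each known header maps to the function that
# parses the text following it. All names here are illustrative only.
def parse_list(text):
    """Split a comma-separated value such as 'Acid, Cold' into a list."""
    return [item.strip() for item in text.split(",") if item.strip()]


def parse_text(text):
    """Keep the value as a single stripped string."""
    return text.strip()


PARSERS = {
    "Damage Vulnerabilities": parse_list,
    "Damage Resistances": parse_list,
    "Damage Immunities": parse_list,
    "Condition Immunities": parse_list,
    "Senses": parse_text,
    "Languages": parse_text,
}


def find_header(line):
    """Return the known header that starts this line, or None."""
    for header in PARSERS:
        if line.startswith(header):
            return header
    return None


def join_and_dispatch(lines):
    """Merge continuation lines, then parse each merged line by its header."""
    merged = []  # list of [header, text] pairs, in file order
    for line in lines:
        line = line.strip()
        if not line:
            continue
        header = find_header(line)
        if header is not None:
            merged.append([header, line[len(header):]])
        elif merged:
            # No known header: assume it continues the previous entry.
            merged[-1][1] += " " + line
    return {header: PARSERS[header](text) for header, text in merged}


lines = [
    "Damage Resistances Acid,",
    "Cold",
    "Damage Immunities Necrotic, Poison",
    "Senses Darkvision 60 ft.",
]
result = join_and_dispatch(lines)
```

One catch with this sketch: any header missing from the table (Challenge, for example) would be glued onto the previous entry, so the table has to list every header that can appear in that part of the stat block.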
-
November 14th, 2021, 00:08 #3
- Join Date
- May 2016
- Posts
- 521
After reading your post again, I think I can't answer this question without additional information. It sounds like this is not really a FGU / Lua question. Either way, if the language you are using has regular expressions that would be the approach I would begin looking at. You also would likely need to look at this on more than a line by line basis.
If this is Lua, then the way I deal with this in my extensions looks something like this:
Code:
local _,_,sSpeed = string.find(sRemainder,"<p>Speed[%s:]*(.-)</p>");
if sSpeed then
    window.t5_speed.setValue(sSpeed);
end
local _,_,sDmgVuln = string.find(sRemainder,"<p>Damage Vulnerabilities[%s:]*(.-)</p>");
if sDmgVuln then
    window.t5_dvuln.setValue(sDmgVuln);
end
local _,_,sDmgResist = string.find(sRemainder,"<p>Damage Resistances[%s:]*(.-)</p>");
if sDmgResist then
    window.t5_dresist.setValue(sDmgResist);
end
local _,_,sDmgImmune = string.find(sRemainder,"<p>Damage Immunities[%s:]*(.-)</p>");
if sDmgImmune then
    window.t5_dimmun.setValue(sDmgImmune);
end
local _,_,sCondImmune = string.find(sRemainder,"<p>Condition Immunities[%s:]*(.-)</p>");
if sCondImmune then
    window.t5_cimmun.setValue(sCondImmune);
end
Jason
Last edited by jharp; November 14th, 2021 at 06:07. Reason: added sample code
-
November 14th, 2021, 05:27 #4
My two questions to you are:
- What language are you programming in?
- What is the format of the text you are parsing (e.g. XML, JSON, etc.)?
-
November 14th, 2021, 13:21 #5
The language I have been using is Python.
I am thinking of writing the next one in Lua, so that I can expand my knowledge. I doubt it, but maybe someday I will work on extensions, etc. for FG.
The data is .txt. Does the data format change the problem? It is easy enough to change the format of the data.
Jason, that is exactly the same coding design that I use... a long series of if statements. I like what you did with the regex expressions. I was first testing for a positive result from the expressions, and then extracting the data. So I was executing the same regex twice. You only do so once.
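For reference, the run-it-once pattern carries over to Python. A minimal sketch, assuming Python 3.8 or later for the `:=` operator; `IMMUNE_RE` and `damage_immunities` are made-up names:

```python
import re

# Compile once; the capture group holds everything after the header.
IMMUNE_RE = re.compile(r"^Damage Immunities\s+(.*)$")


def damage_immunities(line):
    """Return the text after 'Damage Immunities', or None if the line differs."""
    # The match object is created once and reused for both the test
    # and the extraction, so the regex runs a single time per line.
    if (match := IMMUNE_RE.match(line)) is not None:
        return match.group(1)
    return None
```

On older Python versions the same effect comes from assigning `match = IMMUNE_RE.match(line)` on its own line before the `if`.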
I think that a mess of procedural programming spaghetti is perhaps the only way. To show just how messy, here is a code snippet from my most recent NPC parser. The if statements go four layers deep. Yuck!
Code:
elif task == "immunity":
    if line_data == "ATTACK OPTIONS":  # immunity section is empty
        attack_dictionary = "nattack"
        task = "attacks"
    elif line_data[0:6] == "Immune":  # immunities
        npc.immunity = line_data.replace("Immune ", "")
    else:
        if line_data[0] == line_data[0].upper() and line_data[0:8] != "Insanity":  # found a trait
            trait_name = getAttackName(line_data)
            if trait_name:
                current_trait = trait_name
                npc.traits[current_trait] = line_data.replace(current_trait + " ", "")
                task = "traits"
            else:
                # Python 3 cannot raise a bare string; raise an exception instead
                raise ValueError("Cannot parse trait name")
        else:  # not a new trait, so must be continuation of immunities
            npc.immunity = npc.immunity + " " + line_data
-
November 14th, 2021, 14:36 #6
Ultimately if your method works for you that is all that matters. However, I find the line-by-line approach odd. Since this is a stat block, my approach would be to find the beginning and end of the stat block. Store the entire found stat block in a string. That way I can perform repeated finds within that stat block for matching criteria. Yes, there is more overhead but who cares (unless you do). This avoids a lot of the if/elseif structure. The code becomes more single level.
You are then simply performing a bunch of searches, one for each piece you want. You can even have multiple regexes for the same piece: if the first fails, you try the backup, and so on until you give up on that piece. At the end you can then just put all the found pieces where you want them.
Again, this is my approach and no approach is "bad" if it works.
Jason
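A rough Python sketch of the whole-block idea with backup patterns, not a definitive implementation. Assumptions: the whole stat block is already in one string, a value ends at the next known label or at the end of the block, and the `Immune` backup label is borrowed from the snippet in post #5. `LABELS`, `pattern_for`, and `extract` are made-up names.

```python
import re

# Hypothetical field table: each field lists its labels in order of
# preference. The first label that matches wins; the rest are backups.
LABELS = {
    "damage_resistances": ["Damage Resistances"],
    "damage_immunities": ["Damage Immunities", "Immune"],
    "condition_immunities": ["Condition Immunities"],
    "senses": ["Senses"],
    "languages": ["Languages"],
}

# Any known label at the start of a line ends the previous value.
_STOP = "|".join(re.escape(l) for labels in LABELS.values() for l in labels)


def pattern_for(label):
    """Regex capturing the text after `label` up to the next label or block end."""
    return re.compile(
        rf"^{re.escape(label)}[\s:]+(.*?)(?=\n(?:{_STOP})\b|\Z)",
        re.DOTALL | re.MULTILINE,
    )


PATTERNS = {
    field: [pattern_for(label) for label in labels]
    for field, labels in LABELS.items()
}


def extract(block):
    """Search the whole block once per field, trying backups in order."""
    found = {}
    for field, patterns in PATTERNS.items():
        for pattern in patterns:
            match = pattern.search(block)
            if match:
                # Collapse line breaks inside a value split over lines.
                found[field] = " ".join(match.group(1).split())
                break
    return found


block = (
    "Scary Thing\n"
    "Damage Resistances Acid,\n"
    "Cold\n"
    "Immune Necrotic, Poison\n"
    "Senses Darkvision 60 ft.\n"
    "Languages Understands -\n"
)
npc = extract(block)
```

Missing sections simply never show up in the result, so there is no state to track and the order of the sections in the text does not matter.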
-
November 14th, 2021, 15:06 #7
That is a really interesting idea. That is the sort of advice I was looking for. Thank you so much for that. Now I am glad I asked. Perhaps there will be problems (how to maintain the paragraph breaks in the descriptive text), but that is an idea that I can think about the next time I approach a similar problem.
-
November 14th, 2021, 15:20 #8
Another thing I have found that helps is a cleanup pass. I find all the main headers in the first pass and make certain they start the line. So if there is no newline before a header I make certain there is. That way I can find the end of an area by finding a newline (or other marker).
Jason
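The cleanup pass could look roughly like this in Python. It is a sketch under assumptions: the header list is a guess based on the sample NPC in the first post, and `HEADERS` and `normalize_headers` are made-up names.

```python
import re

# Hypothetical list of main headers, taken from the sample NPC entry.
HEADERS = [
    "Damage Resistances",
    "Damage Immunities",
    "Condition Immunities",
    "Senses",
    "Languages",
    "Challenge",
]

# Optional whitespace, then a known header as a whole word.
_HEADER_RE = re.compile(
    r"\s*\b(" + "|".join(re.escape(h) for h in HEADERS) + r")\b"
)


def normalize_headers(text):
    """Put every known header at the start of its own line."""
    # The whitespace before a header (including an existing newline)
    # is replaced by exactly one newline, so no blank lines appear.
    return _HEADER_RE.sub(r"\n\1", text).lstrip("\n")


raw = (
    "Damage Resistances Acid, Cold Damage Immunities Necrotic, Poison "
    "Condition Immunities Charmed Senses Darkvision 60 ft."
)
cleaned = normalize_headers(raw).split("\n")
```

A word of caution: a header word that also shows up inside descriptive text (say, "Languages" in a trait description) would be split onto its own line too, so this pass is probably best limited to the stat portion of the block.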
-
November 14th, 2021, 22:36 #9
Lesser Deity
- Join Date
- Mar 2006
- Location
- Arkansas
- Posts
- 7,397
I would like to point out that in FG there isn't really a specific order XML elements have to be in so it's a bit unsafe to assume monster data is in any particular order. This is somewhat masked by FG's XML parser putting the elements in alpha order on write to disk.
I would suggest you try to avoid deeply nested ifs. Normally, you should be able to use a ladder of if / else if statements that combine conditions with AND and OR instead. This is a C# function I wrote to parse a single monster for 5E. The function itself is called in a loop and I'm using a "smart" XML reader here that understands XML elements and attributes.
Code:
private void ReadMonster(XmlReader xml)
{
    string monsterElement = xml.Name;
    rtbProcessDisplay.AppendText("\nProcessing " + monsterElement + "... ");
    while (xml.Read())
    {
        xml.MoveToContent();
        if (xml.NodeType == XmlNodeType.Element && xml.Name == "abilities")
            ReadMonsterAbilites(xml);
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "ac")
            monster.ac = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "actext")
            monster.acText = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "actions")
            ReadElements(xml, monster.actions, "actions");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "alignment")
            monster.alignment = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "conditionimmunities")
            monster.conditionImmunities = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "cr")
            monster.cr = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "damageimmunities")
            monster.damageImmunites = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "damageresistances")
            monster.damageResistances = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "hd")
            monster.hd = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "hp")
            monster.hp = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "innatespells")
            ReadElements(xml, monster.innateSpells, "innatespells");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "lairactions")
            ReadElements(xml, monster.lairActions, "lairactions");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "languages")
            monster.languages = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "legendaryactions")
            ReadElements(xml, monster.legendaryActions, "legendaryactions");
        // Not reading locked - will lock them all on build
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "name")
            monster.name = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "reactions")
            ReadElements(xml, monster.reactions, "reactions");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "savingthrows")
            monster.savingthrows = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "senses")
            monster.senses = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "size")
            monster.size = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "skills")
            monster.skills = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "speed")
            monster.speed = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "spells")
            ReadElements(xml, monster.spells, "spells");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "text")
            monster.text = xml.ReadInnerXml();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "token")
            monster.token = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "traits")
            ReadElements(xml, monster.traits, "traits");
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "type")
            monster.type = xml.ReadElementString();
        else if (xml.NodeType == XmlNodeType.Element && xml.Name == "xp")
            monster.xp = xml.ReadElementString();

        if (xml.NodeType == XmlNodeType.EndElement && xml.Name == monsterElement)
        {
            break;
        }
    }
}
This particular function reads the XML and loads a monster data structure with the monster data being added to a list of monsters.
Last edited by Griogre; November 14th, 2021 at 22:39.
-
November 15th, 2021, 00:06 #10
Having an XML Parser, especially one that uses XPath syntax, would be very useful here.
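As a small illustration in Python, the standard library's `xml.etree.ElementTree` accepts a limited XPath subset in `findall` and `findtext` (full XPath needs a third-party library). The XML fragment below is invented for the example: the element names follow the C# reader in post #9, but the nesting and values are assumptions, and `FIELDS` and `read_monsters` are made-up names.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML fragment; the real layout of the data may differ.
SAMPLE = """
<npc>
  <scarything>
    <name>Scary Thing</name>
    <damageresistances>Acid, Cold</damageresistances>
    <damageimmunities>Necrotic, Poison</damageimmunities>
    <senses>Darkvision 60 ft.</senses>
  </scarything>
</npc>
"""

FIELDS = [
    "name",
    "damageresistances",
    "damageimmunities",
    "conditionimmunities",
    "senses",
]


def read_monsters(xml_text):
    """Return one dict per monster element; missing fields become None."""
    root = ET.fromstring(xml_text.strip())
    monsters = []
    for node in root.findall("./*"):  # every direct child is one monster
        # findtext looks each field up by path, so element order is irrelevant.
        monsters.append({f: node.findtext(f"./{f}") for f in FIELDS})
    return monsters


monsters = read_monsters(SAMPLE)
```

Because each field is looked up by name, the order of the elements in the file does not matter and an absent element just comes back as `None`.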