 |
Tanuki Novice
Joined: 06 Nov 2008 Posts: 38
|
Posted: Thu Nov 06, 2008 11:42 pm
Need Help with Expression |
Hey Everyone,
I need help with the following expression. I'm trying to parse the following information:
Code: |
No. Mob name Level Area Name Killed
--- ------------------------- ----- -------------------- ------
1 - A serving girl 80 Wedded Bliss 750
2 - An arctic spider 80 Earth Plane 4 109 |
The RegEx ^%s(%d)%s-%s(%*)%s(%d)%s(%*)%s(%d)$ works fine on No. 1 but messes up on No. 2 and returns:
%1 6
%2 An arctic spider 80 Earth Plane
%3 4
%4
%5 109
Thank you.
Tanuki
Note: The forum seems to have messed with the spacing in the example. There needs to be a space before the first digit and no space after the last for the regex to "work".
|
|
 |
aplayer Newbie
Joined: 04 Oct 2008 Posts: 2
|
Posted: Fri Nov 07, 2008 12:20 am |
I put your pattern into a new session and, after adding the space at the beginning of your examples, they both worked well.
One suggestion I have is to remove the "%" from "%*"; the wildcard is just "*" by itself. It didn't seem to matter when I was testing, though.
So it would be:
^%s(%d)%s-%s(*)%s(%d)%s(*)%s(%d)$
For the first one it outputted:
%1 : 1
%2 : A serving girl
%3 : 80
%4 : Wedded Bliss
%5 : 750
For the second one it outputted:
%1 : 2
%2 : An arctic spider
%3 : 80
%4 : Earth Plane 4
%5 : 109
If it's still not working after that, then maybe deleting the setting and pasting the pattern from above will help. |
|
|
 |
Tanuki Novice
Joined: 06 Nov 2008 Posts: 38
|
Posted: Fri Nov 07, 2008 12:31 am |
I removed both of the % like you suggested but still get the funky results. What version of CMUD are you running? I'm running 2.37 from Oct 31st.
|
|
|
 |
Caled Sorcerer
Joined: 21 Oct 2000 Posts: 821 Location: Australia
|
Posted: Fri Nov 07, 2008 12:52 am |
Use ([A-z ]) instead of the first (*)
Then it can't match the number. If you find you need to match any punctuation, just add the commas or whatever to the range. Make sure you keep the space in the range though.
Edit:
Just to be clear:
^%s(%d)%s-%s([A-z ])%s(%d)%s([A-z ])%s(%d)$ |
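For anyone who wants to try the range idea outside CMUD, here is a rough Python sketch. It is not zScript and makes a few assumptions: the sample rows use guessed column spacing, only the mob-name capture is restricted to letters and spaces, and the class is written [A-Za-z ] with an explicit +, because in an ordinary regex [A-z] would also admit the handful of punctuation characters that sit between Z and a in ASCII.

```python
import re

# Assumed Python translation of the range idea: the mob-name capture may only
# contain letters and spaces, so it can never swallow the level column.
RANGE_PATTERN = re.compile(r"^\s+(\d+)\s+-\s+([A-Za-z ]+)\s+(\d+)\s+(.+)\s+(\d+)$")

# Rows from the first post; the exact column spacing is a guess.
ROWS = [
    "  1 - A serving girl               80 Wedded Bliss            750",
    "  2 - An arctic spider             80 Earth Plane 4           109",
]


def parse_with_range(row):
    """Return the five captures with padding stripped, or None if no match."""
    match = RANGE_PATTERN.match(row)
    if match is None:
        return None
    # The greedy captures pick up trailing column padding, so strip it.
    return tuple(field.strip() for field in match.groups())


for row in ROWS:
    print(parse_with_range(row))
```

A mob name containing an apostrophe or hyphen would fail to match until that character is added to the class.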
|
_________________ Athlon 64 3200+
Win XP Pro x64 |
|
|
 |
Fang Xianfu GURU

Joined: 26 Jan 2004 Posts: 5155 Location: United Kingdom
|
Posted: Fri Nov 07, 2008 1:11 am |
I'm seeing your incorrect captures - note that it only captures badly when the area name contains a number. But if it were me, I'd use a fourth pattern:
^\s+(\d+)\s+-\s+(.+?)\s+(\d+)\s+(.+?)\s+(\d+)$
Which is a regex, not a zScript pattern. The reason it needs to be a regex is so you can use a non-greedy modifier +? to stop all the backtracking that's going to slow down the matching of this trigger. Greedy modifiers are also what's causing the incorrect matching, so it has the happy side-effect of solving that problem too. |
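If you want to check the regex outside CMUD, a minimal Python sketch along these lines should do. The pattern is copied unchanged from above; the sample rows, their column spacing, and the parse_row helper are assumptions made up for the test.

```python
import re

# The regex from the post above, copied unchanged.
KILL_ROW = re.compile(r"^\s+(\d+)\s+-\s+(.+?)\s+(\d+)\s+(.+?)\s+(\d+)$")

# Rows from the first post; the exact column spacing is a guess.
SAMPLE = [
    "  1 - A serving girl               80 Wedded Bliss            750",
    "  2 - An arctic spider             80 Earth Plane 4           109",
]


def parse_row(row):
    """Hypothetical helper: turn one table row into a dict, or None if no match."""
    match = KILL_ROW.match(row)
    if match is None:
        return None
    number, mob, level, area, killed = match.groups()
    return {
        "no": int(number),
        "mob": mob,
        "level": int(level),
        "area": area,
        "killed": int(killed),
    }


for row in SAMPLE:
    print(parse_row(row))
```

Header and separator rows do not start with whitespace followed by a digit, so parse_row returns None for them and they can simply be skipped.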
|
|
 |
Tanuki Novice
Joined: 06 Nov 2008 Posts: 38
|
Posted: Fri Nov 07, 2008 1:43 am |
Fang Xianfu, your solution works great!
To Fang Xianfu and everyone else, Thank you all for your help.
Tanuki |
|
|
 |
Caled Sorcerer
Joined: 21 Oct 2000 Posts: 821 Location: Australia
|
Posted: Fri Nov 07, 2008 6:27 am |
The range would have worked too. I'm only pointing that out for anyone that (like me) has this problem with * fairly often, but doesn't have a clue how to duplicate the regex solution without asking for it here first.
The regex solution is probably better, but this can be done with zscript. |
|
|
|
 |
Fang Xianfu GURU

Joined: 26 Jan 2004 Posts: 5155 Location: United Kingdom
|
Posted: Fri Nov 07, 2008 11:29 am |
The trouble with using a range comes when the odd area uses a character you haven't anticipated, an apostrophe, for example. The only difference between my regex and the original pattern is that I'm using .+? instead of .+ (the * character in zScript patterns is converted into .+ when the regex is compiled; you can see this on the trigger testing tab).
All the question mark does is control how the engine does backtracking. Normally, a modifier like + is greedy: the engine matches as many characters as it can, and then gives characters back from the .+ one at a time until the rest of the pattern matches. That's why the example in the first post captures more than it should. The question mark changes the modifier to non-greedy, which instead matches only one character at first and then adds characters until the rest of the pattern matches. You can read about this in more detail here.
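The difference is easy to see side by side. This is only a sketch in Python rather than CMUD: the row spacing is a guess, and GREEDY assumes the original zScript pattern compiles to the .+ form described above.

```python
import re

# Assumed compiled form of the original pattern (greedy .+) next to the
# non-greedy version (.+?). Everything else in the two patterns is identical.
GREEDY = re.compile(r"^\s+(\d+)\s+-\s+(.+)\s+(\d+)\s+(.+)\s+(\d+)$")
LAZY = re.compile(r"^\s+(\d+)\s+-\s+(.+?)\s+(\d+)\s+(.+?)\s+(\d+)$")

# Second row from the first post; the exact column spacing is a guess.
ROW = "  2 - An arctic spider             80 Earth Plane 4           109"

greedy = GREEDY.match(ROW).groups()
lazy = LAZY.match(ROW).groups()

# Greedy: the mob-name capture runs on through the level and most of the area.
print("greedy:", greedy)
# Non-greedy: each capture stops at the first point where the rest can match.
print("lazy:  ", lazy)
```

With this row, the greedy version should put 4 in the level slot and leave the area capture as blank padding, much like the output in the first post, while the non-greedy version splits the columns as intended.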
|
|
 |
|
|