> SayRegexp Problem, Non fixed widths inside lookaround
stevehero
post Jul 1 2017, 14:36
Post #1


Member


Group: Full Members
Posts: 837
Joined: 3-December 10
From: Ireland
Member No.: 13334
Mp3tag Version: 2.84



Using this regexp: (for websource parsing)
CODE
sayregexp "(?<=\"artists\" data-aid=\"(\d+)\">)[^<]+" ", " "trk-list-cont"


Code for testing (using mt_ttt.png)
CODE
(?<=\"artists\" data-aid=\"(\d+)\">)[^<]+


I'm trying to separate the artists out of the following code example.
CODE
<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a><a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont


To result in:
CODE
Artist 1, Artist 2


As you can see, the digits are what make the lookbehind non-fixed-width, so it returns an "Invalid lookbehind" error. I think that is down to the regexp flavor used by Mp3tag, so:

1. Is there any workaround for this?
2. What regexp flavor does Mp3tag use? .NET regexps can take advantage of non-fixed widths inside lookarounds.
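
The error described above can be reproduced outside Mp3tag. The following is a minimal sketch in Python, not Mp3tag script: Python's `re` module is used only as a stand-in because it also rejects variable-width lookbehinds. The helper name `lookbehind_compiles` is hypothetical, and the capture-group alternative at the end is a Python illustration only; whether `sayregexp` can output just a group is not confirmed in this thread.

```python
import re

# Sample input copied from the post.
HTML = ('<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a>'
        '<a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont')

# The pattern from the post, without the script-level quote escaping.
VARIABLE = r'(?<="artists" data-aid="(\d+)">)[^<]+'


def lookbehind_compiles(pattern: str) -> bool:
    """Return True if the engine accepts the pattern (hypothetical helper)."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        # Python's re raises here because \d+ has no fixed width.
        return False


print(lookbehind_compiles(VARIABLE))

# Assumption: where the engine lets you read a group, a plain capture group
# avoids the lookbehind entirely.
artists = re.findall(r'"artists" data-aid="\d+">([^<]+)', HTML)
print(", ".join(artists))
```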



Florian
post Jul 1 2017, 19:13
Post #2


Developer


Group: Admin
Posts: 8106
Joined: 12-December 01
From: Germany, Dresden
Member No.: 203
Mp3tag Version: 2.85a



1. Maybe KillTag can be of help here?
CODE
killtag "a"
killtag "/a" ", "


2. Mp3tag uses boost::regex which doesn't support non-fixed widths inside lookbehind.

Kind regards
– Florian
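
To see roughly what the two killtag lines above would do to the sample line, here is a small Python emulation. It is a sketch under assumptions, not Mp3tag's implementation: the function `kill_tag` is hypothetical and assumes that killtag removes every occurrence of the named tag (including its attributes) and optionally substitutes a replacement string.

```python
import re

# Sample input copied from the first post.
HTML = ('<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a>'
        '<a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont')


def kill_tag(line: str, name: str, replacement: str = "") -> str:
    """Hypothetical emulation: replace each <name ...> tag with `replacement`."""
    pattern = r"<" + re.escape(name) + r"(?:\s[^>]*)?>"
    return re.sub(pattern, replacement, line)


step1 = kill_tag(HTML, "a")          # drops the opening <a ...> tags
step2 = kill_tag(step1, "/a", ", ")  # turns each </a> into ", "
print(step2)
```

Under these assumptions a separator is also left in front of trk-list-cont, so a real script would still need to stop output at that marker or trim the trailing ", ".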


--------------------
♫ If you like using Mp3tag please donate to support further development.

ms6676749
post Jul 2 2017, 10:35
Post #3


Member


Group: Full Members
Posts: 52
Joined: 17-March 16
Member No.: 21892
Mp3tag Version: 2.75



Or maybe do a regexpreplace for the data-aid value to make them all uniform if you don't need those identifiers anymore. Could be numbers or letters, whatever you choose. Here, I'm going to swap those digits with a string of zeroes:

CODE
regexpreplace "(class=\"artists\" data-aid=\")\d+" "${1}00000"

CODE
<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a><a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont


to

CODE
<a href="/artist/artist 1" class="artists" data-aid="00000">Artist 1</a><a href="/artist/artist 2" class="artists" data-aid="00000">Artist 2</a>trk-list-cont


Now you don't have non-fixed lengths to worry about.
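
The normalize-then-match idea can be checked end to end with a short Python sketch. This is an illustration only, not Mp3tag script: `re.sub` stands in for regexpreplace, `\g<1>` plays the role of `${1}`, and Python's `re` (which also requires fixed-width lookbehinds) stands in for the engine behind sayregexp.

```python
import re

# Sample input copied from the thread.
HTML = ('<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a>'
        '<a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont')

# Step 1: overwrite every data-aid value with the same fixed-length string.
normalized = re.sub(r'(class="artists" data-aid=")\d+', r'\g<1>00000', HTML)

# Step 2: the lookbehind is now fixed-width, so the engine accepts it.
artists = re.findall(r'(?<="artists" data-aid="00000">)[^<]+', normalized)
print(", ".join(artists))
```

The replacement string itself is arbitrary; it only has to be the same length for every link so that the lookbehind can spell it out literally.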

This post has been edited by ms6676749: Jul 2 2017, 10:41
stevehero
post Jul 2 2017, 11:27
Post #4


Member


Group: Full Members
Posts: 837
Joined: 3-December 10
From: Ireland
Member No.: 13334
Mp3tag Version: 2.84



QUOTE (Florian @ Jul 1 2017, 19:13) *
1. Maybe KillTag can be of help here?
CODE
killtag "a"
killtag "/a" ", "


2. Mp3tag uses boost::regex which doesn't support non-fixed widths inside lookbehind.

Kind regards
– Florian

Thanks. I've never actually had to use killtag before. I'll try to see if I can make use of it. For now I've used the solution below.

I suppose there are no future plans to migrate to .NET regex?


QUOTE (ms6676749 @ Jul 2 2017, 10:35) *
Or maybe do a regexpreplace for the data-aid value to make them all uniform if you don't need those identifiers anymore. Could be numbers or letters, whatever you choose. Here, I'm going to swap those digits with a string of zeroes:

CODE
regexpreplace "(class=\"artists\" data-aid=\")\d+" "${1}00000"

CODE
<a href="/artist/artist 1" class="artists" data-aid="392805">Artist 1</a><a href="/artist/artist 2" class="artists" data-aid="3805">Artist 2</a>trk-list-cont


to

CODE
<a href="/artist/artist 1" class="artists" data-aid="00000">Artist 1</a><a href="/artist/artist 2" class="artists" data-aid="00000">Artist 2</a>trk-list-cont


Now you don't have non-fixed lengths to worry about.

Thanks. Yeah, that's exactly what I've been doing for years. I just wanted to see if there was a head-on solution.

My solution (to enable a fixed-length lookbehind):
CODE
regexpreplace "(\"artists\" data-aid=\")\d+" "$1albumartist_fix"


This post has been edited by stevehero: Jul 2 2017, 11:38


ms6676749
post Jul 4 2017, 19:53
Post #5


Member


Group: Full Members
Posts: 52
Joined: 17-March 16
Member No.: 21892
Mp3tag Version: 2.75



QUOTE (stevehero @ Jul 2 2017, 11:27) *
Thanks. Yeah, that's exactly what I've been doing for years. I just wanted to see if there was a head-on solution.

My solution (to enable a fixed-length lookbehind):
CODE
regexpreplace "(\"artists\" data-aid=\")\d+" "$1albumartist_fix"

Yeah, yours makes the most sense when targeting %albumartist%.

This post has been edited by ms6676749: Jul 4 2017, 19:53
