> Too complicated to accomplish in Mp3tag?, using regexp
beaker
post Jan 21 2006, 19:59
Post #1


Member


Group: Full Members
Posts: 100
Joined: 6-September 03
From: USA
Member No.: 204
Mp3tag Version: 2.32b



I'd like to do the following in Mp3tag using regexp for filenames only.

01. Split (tokenize) on the following characters: ~ ( ) { } [ ]
02. Ignore leading and trailing spaces for each token, but don't remove them
03. Always capitalize first and last word unless it's some specific word (tweaker, ohGr, etc)
04. Always capitalize remaining words except for articles (a, an, the), conjunctions (and, or, but) and small prepositions (in, out, on, of, to, at)

In Mp3tag I was picturing creating a new action of type "Replace with regular expression". The action would apply to Field _FILENAME and the Regular Expression would return all the words that needed to be capitalized. "Replace matches with" would simply be the same expression with the first character capitalized.

I'm using The Regulator as a guide to step my way through this process. I just want to make sure I'm not on a fool's errand. Is this something that can be accomplished, or is it too involved?
dano
post Jan 22 2006, 01:09
Post #2


Moderator


Group: Moderators
Posts: 4180
Joined: 4-September 03
From: Germany
Member No.: 201
Mp3tag Version: 2.46b



Do you have an example for points 1 and 2?
I'm sure it can be done, but there's probably more than 1 action needed.


beaker
post Jan 22 2006, 01:35
Post #3





QUOTE (beaker @ Jan 21 2006, 18:59)
01. Split (tokenize) on the following characters: ~ ( ) { } [  ]
02. Ignore leading and trailing spaces for each token, but don't remove them


So here's an example filename:
CODE
aps ~ Reggae (2005) Matisyahu [Live At Stubb's{01}] Matisyahu ~ Sea To Sea

I want to treat each token as its own "title".

So if I had that filename, it would break into the following segments:

CODE
aps
Reggae
2005
Matisyahu
Live At Stubb's
01

Matisyahu
Sea To Sea

So, it's split on those characters and because of the wonders of html you can't see that there are spaces both leading and trailing that I'd like to ignore, but not remove. As I'm typing this I'm also realizing that there will be an empty token because of the Track # being included in the CD Name brackets. Need to make sure that that's not a problem as well.
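The tokenizing step described above can be sketched outside Mp3tag. This is only an illustration in Python, not something Mp3tag runs; the variable names are made up for the example. Splitting with a capturing group keeps the separators, so the original filename (spaces included) can be reassembled unchanged.

```python
import re

# Separator characters from the original post: ~ ( ) { } [ ]
# The capturing group makes re.split() return the separators as well.
SEPARATORS = re.compile(r"([~(){}\[\]])")

name = "aps ~ Reggae (2005) Matisyahu [Live At Stubb's{01}] Matisyahu ~ Sea To Sea"

parts = SEPARATORS.split(name)
tokens = parts[::2]       # text between separators, spaces untouched
separators = parts[1::2]  # the separator characters themselves

for token in tokens:
    print(repr(token))    # repr() makes leading/trailing spaces visible

# Nothing was removed: joining the parts restores the filename.
assert "".join(parts) == name
```

The empty token between `}` and `]` shows up as `''` in the output, so any per-token processing has to tolerate an empty string.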

So ultimately, I'd like to end up with the filename:

CODE
Aps ~ Reggae (2005) Matisyahu [Live at Stubb's{01}] Matisyahu ~ Sea to Sea


Namely, do not remove any of the spaces, and capitalize and lowercase (the first letter of each word only) as appropriate. I understand the naming convention will seem awkward to most, but this is used before doing my Filename To Tag conversion.

QUOTE
I'm sure it can be done, but there's probably more than 1 action needed.

If you could help me out, dano, that would be great.
dano
post Jan 22 2006, 15:58
Post #4





Here are two actions:

The first puts everything in "first letter upper case, rest lower case" form. It uses space, ( [ { to define the first letter of a word.

The second action puts your prepositions in lower case if they are surrounded by spaces. (If you also want, e.g., [At to become [at, then you can add that by using (\s|\[).)

Then you just need a third action for point 3 to put your special words in your desired spelling.
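The contents of the attached actions are not shown in the thread, so the following is only a rough Python sketch of the two passes as described, not the actual Mp3tag actions. The function names and the exact word-start pattern are assumptions made for the illustration.

```python
import re

SMALL_WORDS = r"a|an|the|and|or|but|in|out|on|of|to|at"

def capitalize_words(text: str) -> str:
    """Pass 1: first letter upper case, rest lower case.

    A word starts at the beginning of the string or after a space, ( [ or {.
    """
    return re.sub(
        r"(^|[\s(\[{])([^\s(\[{]+)",
        lambda m: m.group(1) + m.group(2).capitalize(),
        text,
    )

def lowercase_small_words(text: str) -> str:
    """Pass 2: lower-case small words that have a space on both sides."""
    return re.sub(
        rf"\s({SMALL_WORDS})(?=\s)",
        lambda m: m.group(0).lower(),
        text,
        flags=re.IGNORECASE,
    )

name = "aps ~ Reggae (2005) Matisyahu [Live At Stubb's{01}] Matisyahu ~ Sea To Sea"
print(lowercase_small_words(capitalize_words(name)))
# Aps ~ Reggae (2005) Matisyahu [Live at Stubb's{01}] Matisyahu ~ Sea to Sea
```

Note that pass 1 lower-cases everything after the first letter, so a name like "McLachlan" would come out as "Mclachlan".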

This post has been edited by dano: Jan 22 2006, 16:03
Attached file: user_beaker.mta (161 bytes)
 


beaker
post Jan 22 2006, 16:39
Post #5





QUOTE (dano @ Jan 22 2006, 14:58)
Here are two actions:

The first puts everything in "first letter upper case, rest lower case" form. It uses space, ( [ { to define the first letter of a word.

The second action puts your prepositions in lower case if they are surrounded by spaces. (If you also want, e.g., [At to become [at, then you can add that by using (\s|\[).)

Then you just need a third action for point 3 to put your special words in your desired spelling.

For this file: dl ~ Reggae (2005) Matisyahu [Live At Stubb's{01}] Matisyahu ~ Sea To Sea.mp3 I get this error
QUOTE
Regular expression: \s(a|an|the|and|or|but|in|out|on|of|to|at)(?\u003d\s)
Invalid preceding regular expression

When I look at the new filename, I see it has capitalized it to: Dl ~ Reggae (2005) Matisyahu [Live At Stubb's{01}] Matisyahu ~ Sea To Sea.mp3

[EDIT] Please don't let my disappointment make you think I'm not appreciative. Thank you!!

This post has been edited by beaker: Jan 22 2006, 16:42
dano
post Jan 22 2006, 16:49
Post #6





OK, you have an older Mp3tag version.
The regular expression must be:
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)
(replace \u003d with =)
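To illustrate both points, here is a small Python sketch (Python's `re` module stands in for Mp3tag's regex engine, which is an assumption; the sample title is made up). It shows that the pattern with the literal text `\u003d` is not a valid expression while the version with `=` is, and why a lookahead `(?=\s)` is preferable to matching the trailing space outright.

```python
import re

# The pattern exactly as it appeared in the error message, and the fix.
BROKEN = r"\s(a|an|the|and|or|but|in|out|on|of|to|at)(?\u003d\s)"
FIXED = BROKEN.replace(r"\u003d", "=")

def compiles(pattern: str) -> bool:
    """Return True if the regex engine accepts the pattern."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

print(compiles(BROKEN))  # False: "(?\" does not start a valid group
print(compiles(FIXED))   # True

def lower_matches(pattern: str, text: str) -> str:
    """Lower-case every match of pattern, ignoring case when matching."""
    return re.sub(pattern, lambda m: m.group(0).lower(), text, flags=re.IGNORECASE)

title = "Walk In And Out Of Town"

# Lookahead: the trailing space is checked but not consumed,
# so it is still available as the leading space of the next word.
print(lower_matches(FIXED, title))  # Walk in and out of Town

# Consuming the trailing space instead skips every second small word.
CONSUMING = r"\s(a|an|the|and|or|but|in|out|on|of|to|at)\s"
print(lower_matches(CONSUMING, title))  # Walk in And out Of Town
```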


beaker
post Jan 22 2006, 17:30
Post #7





QUOTE (dano @ Jan 22 2006, 15:49)
OK, you have an older Mp3tag version.
The regular expression must be:
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)
(replace \u003d with =)

That fixed the error. Thank you!

A few more follow-up questions: Can this be updated to act on the first letter only? Meaning "McLachlan, Sarah" wouldn't get changed to "Mclachlan, Sarah". My big concern is that "Is" and "Be" are capitalized while "in" is not.

Here is an example of why I was thinking of looking at phrases rather than words (this isn't the real name of the song, just used as an illustration):
CODE
aps ~ Alternative (1996) Beck [Odelay{08}] Beck ~ Where It's At (Two Turntables and a Microphone).mp3


This would have the word "At" incorrectly lowercased. I was hoping the action would view a space and any of these characters ()[]{} (or reversed) as the same as being the first or last word. This I think is where the real trouble lies...

It seems like we're almost there. Thanks again for all your help.
beaker
post Jan 22 2006, 17:41
Post #8





I'm just realizing that maybe I'm making this too hard. Basically all of these
CODE
[SPACE]Someword[SPACE]
should be capitalized.
All of these
CODE
[SPACE]someshortpreposition[SPACE]
should be lowercased unless it is preceded or followed by any of these
CODE
()[]{}
Would this be easier?

This post has been edited by beaker: Jan 22 2006, 17:42
dano
post Jan 22 2006, 18:05
Post #9





Next one:
CODE
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[\(\[{])
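As a quick check of what the added negative lookahead does, here is a Python sketch (using Python's `re` module as a stand-in for Mp3tag's engine, with case-insensitive matching assumed). The `(?!\s[\(\[{])` part rejects a small word when the next thing after its trailing space is an opening bracket.

```python
import re

PATTERN = re.compile(
    r"\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[\(\[{])",
    re.IGNORECASE,
)

def lowercase_small_words(text: str) -> str:
    """Lower-case small words unless they are followed by ' (' , ' [' or ' {'."""
    return PATTERN.sub(lambda m: m.group(0).lower(), text)

title = "Beck ~ Where It's At (Two Turntables And A Microphone)"
print(lowercase_small_words(title))
# Beck ~ Where It's At (Two Turntables and a Microphone)
```

"At" survives because it is followed by " (", while "And" and "A" are lower-cased. The pattern only looks at what follows the word, so a small word right after a separator is still affected.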


beaker
post Jan 22 2006, 18:47
Post #10





QUOTE (dano @ Jan 22 2006, 17:05)
Next one:
CODE
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[\(\[{])

I slightly modified it to account for ~ ) ] } as well:
CODE
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[~\(\)\[\]{}])


I also modified the first part to use $caps2, which doesn't lowercase subsequent uppercase letters, so "McLachlan, Sarah" is no longer renamed to "Mclachlan, Sarah".

The only remaining problem I'm having is if a song/artist/whatever begins with one of these.
CODE
aps ~ Alternative Metal (2001) Tool [Lateralus{03}] Tool ~ The Patient
is being changed to
CODE
Aps ~ Alternative Metal (2001) Tool [Lateralus{03}] Tool ~ the Patient
I see the problem in Regulator, but I'm not sure how to add another conditional in regexp. Checking to make sure it's not ~()[]{}[SPACE] before the word would take care of the problem.

It's hard to believe how much logic these regexps can account for. All the tutorials I've found online only say what each character means, i.e. you already need to know what "negative lookahead" means. Is there a site that you recommend?
beaker
post Jan 22 2006, 20:06
Post #11





Ok. I found a good site that explains things a bit better for me. From what I've read, the following should work:
CODE
(?<![~\(\)\[\]{}])\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[~\(\)\[\]{}])
This does work in The Regulator. Unfortunately, I'm getting this error again
QUOTE
Regular expression: (?<![~\(\)\[\]{}])\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[~\(\)
Invalid preceding regular expression
I can get the check for ~ to work, but not the check for a list of characters. Any thoughts?
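For comparison, the full expression does behave as intended in an engine that supports lookbehind with a character class. A Python sketch (Python's `re` module is used here only as a stand-in; it says nothing about which Mp3tag versions accept the pattern):

```python
import re

PATTERN = re.compile(
    r"(?<![~\(\)\[\]{}])\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)(?!\s[~\(\)\[\]{}])",
    re.IGNORECASE,
)

def lowercase_small_words(text: str) -> str:
    """Lower-case small words unless a separator sits just before or after them."""
    return PATTERN.sub(lambda m: m.group(0).lower(), text)

# "The" follows "~ ", so the negative lookbehind blocks the match.
print(lowercase_small_words("Tool [Lateralus{03}] Tool ~ The Patient"))
# Tool [Lateralus{03}] Tool ~ The Patient

# "At" precedes " (", so the negative lookahead blocks the match.
print(lowercase_small_words("Beck ~ Where It's At (Two Turntables And A Microphone)"))
# Beck ~ Where It's At (Two Turntables and a Microphone)
```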

[EDIT]copy/paste error

This post has been edited by beaker: Jan 22 2006, 20:24
dano
post Jan 22 2006, 20:09
Post #12





You could add (?<![~\(\)\[\]{}]) to the beginning of the regex, but I don't know if it is supported in your version.

http://www.regular-expressions.info/ is a nice site.

lol

This post has been edited by dano: Jan 22 2006, 20:09


beaker
post Jan 22 2006, 20:26
Post #13





QUOTE (dano @ Jan 22 2006, 19:09)
You could add (?<![~\(\)\[\]{}]) to the beginning of the regex, but I don't know if it is supported in your version.


Does it work in your version?
dano
post Jan 22 2006, 20:33
Post #14





Yes it works. There was an upgrade in the engine some time ago (probably with the new unicode build)


beaker
post Jan 22 2006, 20:52
Post #15





QUOTE (dano @ Jan 22 2006, 19:33)
Yes it works. There was an upgrade in the engine some time ago (probably with the new unicode build)


It's the Unicode changes that are scaring me away from the new versions. If I don't want to write Unicode (because of my hardware mp3 players) I need to tell it to write ASCII. Unfortunately, if I write ASCII I lose special characters like Æ in Ænema and such. I understand Florian's need to implement it; lots of people were asking for it. I'm sure I'm in the minority on this issue, just letting you know my reasoning.

I've come up with a workaround; I stole a page from your book, dano. I look for [SPACE]~ and replace it with ¤~, then I replace ~[SPACE] with ~¤. I do the same for all the separators (){}[]. Then I run this regexp:
CODE
\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)

This of course is the first one you posted (with the fix for my version of mp3tag). After all is said and done, I replace ¤ with a space.
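The workaround can be sketched in Python as three steps. This is only an illustration of the idea, not the Mp3tag actions themselves; the function name is made up, and it assumes that ¤ never occurs in a real filename.

```python
import re

SEPARATORS = "~(){}[]"
PLACEHOLDER = "¤"  # assumed not to appear in any filename
SMALL_WORDS = re.compile(
    r"\s(a|an|the|and|or|but|in|out|on|of|to|at)(?=\s)",
    re.IGNORECASE,
)

def lowercase_small_words(text: str) -> str:
    # Step 1: hide every space that touches a separator behind the placeholder.
    for sep in SEPARATORS:
        text = text.replace(" " + sep, PLACEHOLDER + sep)
        text = text.replace(sep + " ", sep + PLACEHOLDER)
    # Step 2: run the simple pattern; words next to a separator no longer
    # have a real space on that side, so they are left alone.
    text = SMALL_WORDS.sub(lambda m: m.group(0).lower(), text)
    # Step 3: turn the placeholders back into spaces.
    return text.replace(PLACEHOLDER, " ")

print(lowercase_small_words("Tool [Lateralus{03}] Tool ~ The Patient"))
# Tool [Lateralus{03}] Tool ~ The Patient
print(lowercase_small_words("Beck ~ Where It's At (Two Turntables And A Microphone)"))
# Beck ~ Where It's At (Two Turntables and a Microphone)
```

The trade-off is the one already noted: no lookbehind is needed, at the cost of extra replace steps before and after the regexp.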

Works like a champ, just more steps.

Thanks again for all your help, dano. I appreciate your patience.