Full Version: Regular Expressions
Florian
This thread collects some useful regular expressions.

If you created a regular expression which solves a common task, please post it here and give a short description what it is supposed to do.

These regular expressions can be used with the action type Replace with regular expression. Please look at FAQ: How do I create a new action? to learn more about actions in Mp3tag.

-> If you have problems with a regular expression, please open a separate topic.
Florian
Trim leading/trailing spaces

See FAQ: Trim leading/trailing spaces.
areve
Regular expression: ^\s*[0-9]+\s*-\s*
Replace with:

This one will remove the track-number (if followed by a dash) and white space at the beginning of a string
(for instance 01 - Come Together will become Come Together)


Regular expression: ^\s*[0-9]+\s+
Replace with:

This one will remove the track-number if followed by a white space
(for instance 01 Come Together will become Come Together)
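
To try these two expressions outside Mp3tag, here is a minimal Python sketch. It assumes Python's `re` module treats `^`, `\s` and `[0-9]+` like Mp3tag's engine does for such simple patterns; the function names are made up for illustration.

```python
import re

# areve's two expressions, unchanged.
DASH_PREFIX = re.compile(r"^\s*[0-9]+\s*-\s*")
SPACE_PREFIX = re.compile(r"^\s*[0-9]+\s+")


def strip_track_with_dash(title: str) -> str:
    """Remove a leading track number that is followed by a dash."""
    return DASH_PREFIX.sub("", title)


def strip_track_with_space(title: str) -> str:
    """Remove a leading track number that is followed by white space."""
    return SPACE_PREFIX.sub("", title)


if __name__ == "__main__":
    print(strip_track_with_dash("01 - Come Together"))  # Come Together
    print(strip_track_with_space("01 Come Together"))   # Come Together
    print(strip_track_with_dash("1999"))                # 1999 (no dash, untouched)
```

Note that a title consisting only of a number, such as "1999", is left alone by both expressions because nothing follows the digits.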
areve
Here is another one.
I don't think it will be useful as such to a lot of people, but with regexps you can never have too many examples.

My files were tagged as such:
name of the artist.2003AL-name of the album
(the 2003AL meaning: released in 2003, Album, Live, but it could be 1977A, or 1995IR)
Weird, I know.
I wanted to change that to a simpler
2003-name of the album
Here is what I did:
Regular expression: .*\.([0-9]*).*-(.+)
Replace with: $1-$2
phoenixdarkdirk
This regular expression fixes tracknumbers from iTunes in the [track]/[numtracks] format (like 3/12).

Regex:
CODE
^(\d+)/\d+


Replace with:
CODE
$1


This can be applied to every file and it will only correct ones that have the / in them. Good luck!

According to RevRagnarok's suggestions below, I've made the expression less greedy.
areve
Another one, that switches first and last names (for instance, Jacques Brel will become Brel, Jacques):
Regular Expression: ^(.+)\s(.+)$
Replace with: $2, $1
mll
Used to use this basic regexp :

Applied on TRACK, find ^(\d)$ and replace by 0$1

It adds a leading 0 to the track number. I was thinking about enhancing it to apply it only to 1-digit numbers... when I discovered Ctrl+K.

Cheers,

MLL

According to RevRagnarok's suggestions below, I've made the expression less greedy
ThurstonX
If you have a filename:
  • 01 Trackname
and want it to be
  • 01 - Trackname
create an Action with the following properties:
  • Field: _FILENAME
  • Regular Expression: ^([0-9]+)\s(.+)$
  • Replace matches with: $1 - $2

Edit: Topic merged by moderator.
RevRagnarok
I've noticed a lot of these RegExs 'go overboard' in their matching. A key to good expressions is limiting (1) the false matches and (2) how long the engine needs to analyze the string.

For example, my two favorite expressions I have posted here are both above on this page. However, they have very greedy matchers that can easily result in lost data.

phoenixdarkdirk's - why NOT limit the track 'numbers' to digits with \d+ ?
mll's - really went overboard when a simple ^(\d)$ would've done it.

I highly recommend this book if you are serious about using REs. Of course, I recommend trying to find it at a technical library, because if you look it over a little you may realize you were just kidding and save yourself the money.

Of course, when we are using them on a handful of MP3 files, it's no big deal. It's when you are handling megabytes of text files that it really matters.

Edit 5 Apr 2013 - After over eight years, fixed the URL to my blog.
Florian
RevRagnarok,

QUOTE (RevRagnarok @ Nov 18 2004, 01:23 PM)
I've noticed a lot of these RegExs 'go overboard' in their matching. A key to good expressions is limiting (1) the false matches and (2) how long the engine needs to analyze the string.

Thanks a lot for your suggestions. I've changed the expressions you've mentioned to be less greedy.

Best regards,
~ Florian
nickless
This RegEx converts abbreviations made of single characters with periods between (and/or after) them to uppercase.
Single characters without periods around them remain lowercase, except when a "-" and a space come before them. (Example: "Songname - A text")

CODE
RegEx: ( |\.|^)(\w)(?= |\.|$)
Replace with: $1$upper($2)
and
CODE
RegEx: ([^-])( \u )
Replace with: $1$lower($2)

Example:
CODE
a.b a Reg.eX in p.o.d p. diidi e.t. - a bad a.i
==>
A.B a Reg.eX in P.O.D P. diidi E.T. - A bad A.I

Both regular expressions should be executed one after another

Use
CODE
RegEx: ( \u )
Replace with: $lower($1)
instead of second RegEx to have chars after " - " lowercase too. (Example: "Songname - a text")

Regards
nickless

Edit: removed some unnecessary characters from RegEx
ThurstonX
Title Tag Conversion

This 2-step action does:

1. set Title tag to Filename
2. convert titles using, e.g.:
01 - Track Name
to
01. Track Name

First step:
Format Value
Field: TITLE
Formatstring: %_FILENAME%

Second step:
Set a new Replace with Regular Expression action
Field: TITLE
Regular expression:
^([0-9]+)\s*-\s*(.+)$
Replace matches with:
$1. $2

I use the first format for all file names, but my iRiver displays the title tag, which looks better (and saves one character) using the latter. For me it's a 2-step process: Filename-to-Tag (%TITLE% only) and then convert Title tag. Assumes your filename is the way you want it, of course.
dano
Upper case for Roman numbers

Regular expression:
\b(?:M{0,3})(?:D?C{0,3}|C[DM])(?:L?X{0,3}|X[LC])(?:V?I{0,3}|I[VX])(?=(\.\s|\s|\)|$))
Replace matches with:
$upper($0)
[ ] case-sensitive comparison
Snykch
Convert artist names in the form 'Artist, The'

Regular expression: (.*),\sthe$
Replace with: The $1
Michaelm
And the other way around

Move "The " to the end

Regular expression: ^The (.+)
Replace with: $1, The


So "The Cure" becomes "Cure, The"
And "The a whole lot of words" becomes "a whole lot of words, The"
Fan o blues
It's taken me far too long to figure it out but I've finally got it working.

To change Horne, Lena to Lena Horne or
Hawkins, Coleman With Manny Albam & His Orchestra to Coleman Hawkins With Manny Albam & His Orchestra
Field: ARTIST
Regular Expression: ^([\w]+),\s([\w]+)

Replace Matches with: $2 $1

I hope I didn't miss the answer in here somewhere. If there's an easier or better way please feel free to help.
neilpa
This regex can remove/replace all but the last dot from a string. It's useful for people like me who prefer to limit filename chars to a-z, 0-9, '_', and '-'. We need to save that last dot for file extensions.

Regex: [.](?![\w]{2,4}$)

Assumes file extensions are 2-4 chars long (not including the dot). If you are curious how this works google regex lookaround.
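
A small Python sketch of the lookahead idea, for anyone who wants to experiment. It assumes Python's `re` handles `(?!...)` the same way as Mp3tag's engine; the underscore replacement and the function name are only examples.

```python
import re

# neilpa's expression: a dot that is NOT followed by
# 2-4 word characters and then the end of the string.
INNER_DOT = re.compile(r"[.](?![\w]{2,4}$)")


def replace_inner_dots(filename: str, replacement: str = "_") -> str:
    """Replace every dot except one that looks like the extension dot."""
    return INNER_DOT.sub(replacement, filename)


if __name__ == "__main__":
    print(replace_inner_dots("my.song.name.mp3"))  # my_song_name.mp3
    print(replace_inner_dots("a.b.flac"))          # a_b.flac
    print(replace_inner_dots("song.live"))         # song.live ("live" looks like an extension)
```

The last call shows the limitation mentioned above: any 2-4 character tail after the final dot is treated as an extension.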
Nathan
I recently converted all my wma files to mp3 only to realize that all the tags were lost. My music is in the form:

...\artist\album\track title.mp3

MP3TAG came to my rescue! Not sure whether there was an easier way to do this, but I got started by reading this. Sorry if these are lame ways of doing it, but it's my 1st experience with regexp.

format artist:
$regexp(%_folderpath%,.+\\(.+)\\(.+)\\,$1)

format album:
$regexp(%_folderpath%,.+\\(.+)\\(.+)\\,$2)

Next 2 I got from ThurstonX in this post.

format track:
$regexp(%_filename%,(\d*)\s?(.+),$1)

format title:
$regexp(%_filename%,(\d*)\s?(.+),$2)

This program saved me TONS of time...
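
The four format strings can be checked outside Mp3tag with a short Python sketch. It assumes that `%_folderpath%` ends with a backslash (which the patterns rely on) and that `re.sub` is a fair stand-in for `$regexp`; the helper name and the sample path are invented.

```python
import re

# Nathan's two patterns, unchanged. Raw strings keep the backslashes
# exactly as the regex engine should see them.
FOLDER = r".+\\(.+)\\(.+)\\"
NAME = r"(\d*)\s?(.+)"


def tags_from_path(folderpath: str, filename: str) -> dict:
    """Mimic the four $regexp(...) format strings with re.sub."""
    return {
        "artist": re.sub(FOLDER, r"\1", folderpath),
        "album": re.sub(FOLDER, r"\2", folderpath),
        "track": re.sub(NAME, r"\1", filename),
        "title": re.sub(NAME, r"\2", filename),
    }


if __name__ == "__main__":
    sample = tags_from_path("C:\\Music\\The Beatles\\Abbey Road\\", "01 Come Together")
    print(sample)
```

Because the leading `.+` is greedy, the two groups always pick up the last two folder names, however deep the path is.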
Harakiri
Another member helped me creating one of the most useful REGEXP that i use so far:

From this:

Röyksopp - Beautiful DAY withOUT yoU (Rex The Dog Remix)

To this:

Röyksopp - Beautiful day without you (Rex The Dog Remix)

USING:

Action type: Replace with regular expression
Field: TITLE
Regular expression: ^(.*?)(\(|$)
Replace matches with: $caps3($1)$2
[ ] case-sensitive comparison


See the original thread here
KCE
Really simple but this will remove any parenthesis and its contents in any part of the string

Example:
Blah (blah1) (blah 2)
Result:
Blah

Regular expression: \(.+?\)

Replace matches with:
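
A quick Python sketch of why the `?` matters here. It assumes Python's `re` treats the lazy quantifier like Mp3tag does; the greedy pattern is added only for comparison and the function name is hypothetical.

```python
import re

LAZY = re.compile(r"\(.+?\)")    # KCE's expression
GREEDY = re.compile(r"\(.+\)")   # same thing without the "?", for comparison


def drop_parentheses(text: str) -> str:
    """Remove each parenthesised part on its own."""
    return LAZY.sub("", text)


if __name__ == "__main__":
    # The spaces that surrounded the parentheses stay behind.
    print(repr(drop_parentheses("Blah (blah1) (blah 2)")))  # 'Blah  '
    # Greedy matching swallows everything from the first "(" to the last ")".
    print(repr(GREEDY.sub("", "A (x) B (y)")))               # 'A '
    print(repr(LAZY.sub("", "A (x) B (y)")))                 # 'A  B '
```

The leftover spaces can be cleaned up with the trim and double-space expressions posted elsewhere in this thread.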
captainmidnight
I dislike it when a leading zero appears in the track number of the track tag (or when the total track count is added after a '/' char).

[Aside: It seems like a bad idea to pollute a nice simple tag like the track with extra information like this. My intuition is that the information in each tag should be kept as simple and pure as possible. In databases there is a similar concept: you usually want to achieve what is called a normalized design. But if someone has a good reason for doing otherwise, please educate me!

Note that I only have a problem with leading zeroes in the track TAGS. In contrast, when I use tracks in filenames, I DO like a leading zero if there are > 9 tracks on the CD because that causes the filenames to be lexicographically ordered, which is critical for proper sorting by your file system, as well as it visually looks better when filenames are displayed in a list.]

Earlier in this thread, phoenixdarkdirk pointed out how to remove any '/' char and following digits.

Here is how to remove any leading zeroes (as well as trimmable whitespace):

Regex:
CODE
\s*0(\d+)\s*


Replace with:
CODE
$1
captainmidnight
Actually, here is a single regex that does everything that I want:

Regex:
CODE
\s*0?(\d+)(\s*/\s*\d+)?\s*


Replace with:
CODE
$1


This will remove any trimmable whitespace around the number (which will be the track tag in my case), remove any leading zero from the number, and remove any suffix after the number that starts with a '/' char (e.g. the total track count).
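
Here is the combined expression as a Python sketch, assuming `re.sub` with `\1` behaves like the Mp3tag replacement with `$1`; `normalize_track` is just an illustrative name.

```python
import re

# captainmidnight's single expression, unchanged.
TRACK = re.compile(r"\s*0?(\d+)(\s*/\s*\d+)?\s*")


def normalize_track(value: str) -> str:
    """Keep only the track number, without one leading zero or a /total suffix."""
    return TRACK.sub(r"\1", value)


if __name__ == "__main__":
    for raw in [" 03/12 ", "3 / 12", "07", "10", "0"]:
        print(repr(raw), "->", repr(normalize_track(raw)))
```

Only a single leading zero is dropped (`0?`), so "007" would become "07", and a lone "0" survives because `\d+` still needs one digit.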
Craig
QUOTE (Fan o blues @ Mar 25 2007, 19:05) *
It's taken me far too long to figure it out but I've finally got it working.

To change Horne, Lena to Lena Horne or
Hawkins, Coleman With Manny Albam & His Orchestra to Coleman Hawkins With Manny Albam & His Orchestra
Field: ARTIST
Regular Expression: ^([\w]+),\s([\w]+)

Replace Matches with: $2 $1

I hope I didn't miss the answer in here somewhere. If there's an easier or better way please feel free to help.



I've just spent a while figuring this out myself, and came on here to post it only to find someone else already has.
Mine is quite similar
Field: ARTIST
RegExp: ^(.+),\s+(\w*)
Replace With: $2 $1

Seems to work the same
Lebon14
QUOTE (KCE @ Oct 24 2008, 17:08) *
Really simple but this will remove any parenthesis and its contents in any part of the string

Example:
Blah (blah1) (blah 2)
Result:
Blah

Regular expression: \(.+?\)

Replace matches with:


Thank you so much for this! You saved me a LOT of time!
Florian
I've split some off-topic posts to Problem with Actions and Regular expressions.
MatthiasM
This is not a regular expression for mp3tag, but a link to a page which contains lots of information about creating and using regular expressions.

In my opinion everybody who uses regexp should have read it.
www.regular-expressions.info

kind regards,
Matthias
incifinci
Switches first and last names

QUOTE (areve @ May 30 2004, 21:20) *
Another one, that switches first and last names (for instance, Jacques Brel will become Brel, Jacques):
Regular Expression: ^(.+)\s(.+)$
Replace with: $2, $1

Thank you!
I developed it a little.

Variant A
It will not change artists where the conversion has already been done (the artist contains a comma).
Example: Brel, Jacques William remains Brel, Jacques William

Action: regular expression
Field: ARTIST
Regular expression: ^([^,]+)\s([^,]+)$
Replace with: $2, $1


Variant B

It handles multiple artists in WMP format ("artist1/artist2/artist3"), possible nicknames at the end in square brackets, and trims unnecessary white space.
Example:
Elvis Presley /Brel, Jacques/ Gabriele Susanne Kerner [Nena ]
becomes
Presley, Elvis/Brel, Jacques/Kerner, Gabriele Susanne [Nena]

Action group
1. action: regular expression (trim white spaces)
Field: ARTIST
Regular expression: ^\s+|\s+$
Replace with: (nothing)

2. action: regular expression (replace multiple white spaces with one)
Field: ARTIST
Regular expression: \s{2,}
Replace with: " " (without quotation marks)

3. action: replace (delete white space after opening bracket)
Field: ARTIST
Replace: "[ " (without quotation marks)
Replace with: [

4. action: replace (delete white space before closing bracket)
Field: ARTIST
Replace: " ]" (without quotation marks)
Replace with: ]

5. action: split fields by separator (for the next actions)
Field: ARTIST
Separator: /

6. action: regular expression (switches names without nickname)
Field: ARTIST
Regular expression: ^([^,[]+)\s([^,[]+)$
Replace with: $2, $1

7. action: regular expression (switches names with nickname)
Field: ARTIST
Regular expression: ^([^,[]+)\s([^,[]+)\s(\[.+\])$
Replace with: $2, $1 $3

8. action: merge duplicate fields (...back)
Field: ARTIST
Separator: /
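
The eight actions of Variant B can be sketched as one Python function. This is only an approximation under a few assumptions: Python's `re` stands in for Mp3tag's engine, the `[` inside the character classes is escaped to keep Python happy, and each part is stripped again after the split (it is not known here whether Mp3tag's split action does that). The function name is hypothetical.

```python
import re


def swap_names(artist_field: str) -> str:
    # Actions 1-2: trim both ends, collapse runs of white space.
    text = re.sub(r"^\s+|\s+$", "", artist_field)
    text = re.sub(r"\s{2,}", " ", text)
    # Actions 3-4: tidy spaces just inside the square brackets.
    text = text.replace("[ ", "[").replace(" ]", "]")
    # Action 5: split into single artists.
    # Assumption: stray spaces around each part are stripped here.
    parts = [part.strip() for part in text.split("/")]
    swapped = []
    for part in parts:
        # Action 6: "First Last" -> "Last, First" (no comma, no nickname).
        part = re.sub(r"^([^,\[]+)\s([^,\[]+)$", r"\2, \1", part)
        # Action 7: same, but keep a trailing [nickname].
        part = re.sub(r"^([^,\[]+)\s([^,\[]+)\s(\[.+\])$", r"\2, \1 \3", part)
        swapped.append(part)
    # Action 8: merge the parts back together.
    return "/".join(swapped)


if __name__ == "__main__":
    print(swap_names("Elvis Presley /Brel, Jacques/ Gabriele Susanne Kerner [Nena ]"))
```

Artists that already contain a comma fall through both expressions unchanged, which is the Variant A behaviour.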
Zoofield
These are my regular expressions. Some may duplicate what has been posted previously; others contain small changes to fix the many errors that the posted expressions create. I have done a lot of testing to make sure these do what they say they do and nothing else. I hope this also gives me a place to keep a backup.
Thank-you! Updated: (June/10/2011) Mp3tag 2.49

Action Group:Aa
CODE
RE: Remove track number from title, ex."1 - Title" < "Title".
Field:TITLE
re:^\s*\d+\s*-\s*
Nothing:

RE: Remove "The" from artist, ex."The Artist" < "Artist".
Field:ARTIST
re:^The\s+
Nothing:

Case conversion:
Field:_TAG
Case conversion:Mixed Case
/[({"-_

RE:Capitalize Roman Numerals up to 399. (trimmed down from 3999
RE:to reduce false positives) - http://bit.ly/lZdZsj
Field:_Tag
re:(?<!')\b(?=[CLXVI])((C{0,3})?((X[LC])|(L?X{0,3})|L)?((I[VX])|(V?(I{0,3}))|V)?)\b
$upper($0)

RE: Capitalize Zero stop Acronyms and Initialisms. - http://bit.ly/jADbI6
Field:_Tag
re:(?:(?<=[^\w\']|\_)|(?<=^))(ac|ad|afi|aol|asap|atm|bbc|bc|bce|blt|btw|cc|cia|crc|cst|csv|dc|dfa|dj|dmv|doa|dst|eod|ep|est|et|faq|fbi|fm|gi|glc|gmo|imo|imho|iq|ira|jc|irs|krs|la|lp|mc|mst|mtd|nasa|oj|pc|pi|pj|pm|ps|qed|rv|sos|ssr|usa|ussr|tba|tbd|teotwawki|tlc|tv|ufo)(?=[^\w\']|\_|$)(\.*)
$upper($1)

RE: Capitalize the Letter Infront of a period. A.B.C.
Field:_Tag
re:(?<=\.)([^\W\d\_])
$upper($1)

RE: Capitalize the Letter Infront of a Space and Apostrophe.
Field:_Tag
re:(?<=\s')([^\W\d\_])
$upper($1)

RE: Lower Case Prepositions, Articles and, Coordinating Conjunctions.
Field:_Tag
re:(?<=\w\s)(a|as|at|an|about|above|across|after|against|along|alongside|although|among|and|around|as|at|because|before|behind|below|beneath|beside|between|beyond|but|by|de|despite|down|during|even|except|excepting|for|from|if|in|inside|into|like|near|next|nor|of|off|on|onto|or|out|outside|over|past|regarding|round|since|so|than|the|through|throughout|till|to|toward|under|underneath|unlike|until|up|upon|von|when|while|with|within|without|yet)(?=\s\w)
$lower($0)

RE: Lower Case Abbreviations, Add Stop.
Field:_Tag
re:(?<=[^\w\']|\_)(alt|ave|capt|cent|corp|div|ed|eg|etc|fag|feat|gen|hr|ie|inc|inst|lb|ltd|min|mt|op|pl|pop|pseud|pt|pub|rev|sec|ser|sgt|st|univ|vs|vol)(?=[^\w\']|\_)(\.*)
$lower($1).

RE: Add Space Before, & ( { [ + =
Field:_Tag
re:([^\W\_])([&\(\{\[\+\=])
$1 $2

RE:Add Space After, & ) } ] ; : , ! + =
Field:_Tag
re:([&\)\}\]\;\:\,\!\+\=])([^\W\_])
$1 $2

RE: Add Space After, . http://bit.ly/jADbI6
Field:_Tag
re:(?<!^)(?<!\d|\s|\.)(\.)([^\W\d\_])(?!\.|\s|$)
$1 $upper($2)

RE: Add Space After, " "
Field:_Tag
re:(".*?")([^\W\_])(?!$)
$1 $2

RE: Add Space Before, " "
Field:_Tag
re:([^\W\_])(".*?")
$1 $2

RE: Remove Spaces After, ( [ { Before, ] } ) ? : ; , ! .
Field:_Tag
re:([\(\[\{])\s+|\s+([\]\}\)\?\:\;\,\!\.])
$1$2

RE: Remove Spaces Before / After.
Field:_Tag
re:\s+(\/)\s+
$1

RE: Remove Spaces inside, " "
Field:_Tag
re:"\s*(.*?)\s*"
"$1"

RE: Remove Spaces Before and After String.
Field:_Tag
re:^\s+|\s+$
Nothing:

RE: Remove all Double+ Spacing.
Field:_Tag
re:\s{2,}
One Space:

RE: Add Apostrophe to Are Contractions.
Field:_Tag
re:\b(how|they|what|when|where|why|you)re(?=[^\w\']|\_|\$)
$1're

RE: Add Apostrophe to Had/Would Contractions.
Field:_Tag
re:\b(he|how|i|it|she|they|we|what|where|who|why|you)d(?=[^\w\']|\_|\$)
$1'd

RE: Add Apostrophe to Have Contractions.
Field:_Tag
re:\b(could|how|i|might|must|should|we|what|when|where|would|you)ve(?=[^\w\']|\_|\$)
$1've

RE: Add Apostrophe to Is Contractions.
Field:_Tag
re:\b(he|here|how|it|let|she|that|there|two|what|when|where|who|why)s(?=[^\w\']|\_|\$)
$1's

RE: Add Apostrophe to Not Contractions.
Field:_Tag
re:\b(ain|aren|can|couldn|didn|doesn|don|hadn|hasn|haven|isn|mightn|mustn|shouldn|wasn|weren|won|wouldn)t(?=[^\w\']|\_|\$)
$1't

RE: Add Apostrophe to Will Contractions.
Field:_Tag
re:\b(how|i|it|she|that|there|they|what|when|where|who|why|you)ll(?=[^\w\']|\_|\$)
$1'll

RE: Add Apostrophe to Am Contractions.
Field:_Tag
re:\b(i)m(?=[^\w\']|\_|\$)
$1'm

RE: Add Apostrophe to Do Ya Contraction.
Field:_Tag
re:\b(D)ya(?=[^\w\']|\_|\$)
$1'ya

RE: Add Apostrophe to Do You Contraction.
Field:_Tag
re:\b(D)you(?=[^\w\']|\_|\$)
$1'you

RE: CamelCase Mc Words.
Field:_Tag
re:\bMc(.)
Mc$upper($1)

RE:CamelCase O' Words.
Field:_Tag
re:\bO'([^\W\d\_e])
O'$upper($1)
d003232
My title and/or filename looks like:

Gospel&spiritual 1 - Track 03.[128kb 44khz 2'34]

I want to delete the suffix: '.[128kb 44khz 2'34]'.

I do it the way:

Field: TITLE
RegExp: (.*)\.\[[0-9]+kb\s[0-9]+khz.*\]
Replace with: $1


Just change that to the field _FILENAME if you want to do it for the filename.

If you want an easier solution (for example, deleting everything after the first dot), see the next posts from DetlevD or RevRagnarok.

Update: Added a '\' before the dot to make sure that only a real dot before the '[' is replaced. Thanks to RevRagnarok for the hint.
DetlevD
QUOTE (d003232 @ Aug 16 2010, 12:38) *
...Gospel&spiritual 1 - Track 03.[128kb 44khz 2'34] ... I want to delete the suffix: '.[128kb 44khz 2'34]' ...

This is a good example of a situation where you don't really need to learn the regular expression language and can use a simple action instead.

Actiontype 7: Import tag fields (guess values)
Source format: %TITLE%
Guessing pattern: %TITLE%.%DUMMY%
From:
Gospel&spiritual 1 - Track 03.[128kb 44khz 2'34]
To:
Gospel&spiritual 1 - Track 03

DD.20100816.1822.CEST
RevRagnarok
QUOTE (d003232 @ Aug 16 2010, 06:38) *
I do it the way:

Field: TITLE
RegExp: (.*).\[[0-9]+kb\s[0-9]+khz.*\]
Replace with: $1

FYI, that is slightly incorrect. It will actually remove 1 character from before the '[' which in your example is '.' - you need to escape the '.' to ensure it is a '.':
Field: TITLE
RegExp: (.*)\.\[[0-9]+kb\s[0-9]+khz.*\]
Replace with: $1

Another simpler option would be to just grab everything before the first dot:
Field: TITLE
RegExp: ^([^\.]*).*$
Replace with: $1

Or if you want everything up to the first '[' but no period, why bother matching what is in the brackets unless you planned on parsing it for something else?

Field: TITLE
RegExp: (.*)\.\[
Replace with: $1


I am at work, so this is all untested RE code.
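
The difference between `.` and `\.` only shows up when the character before the '[' is not a dot. A small Python sketch, assuming `re` behaves like Mp3tag's engine for these patterns (the second test title is made up):

```python
import re

UNESCAPED = re.compile(r"(.*).\[[0-9]+kb\s[0-9]+khz.*\]")   # "." = any character
ESCAPED = re.compile(r"(.*)\.\[[0-9]+kb\s[0-9]+khz.*\]")    # "\." = a real dot


def strip_suffix(pattern: re.Pattern, title: str) -> str:
    """Apply the replacement $1 (written \\1 in Python)."""
    return pattern.sub(r"\1", title)


if __name__ == "__main__":
    with_dot = "Gospel&spiritual 1 - Track 03.[128kb 44khz 2'34]"
    without_dot = "Track 03[128kb 44khz 2'34]"
    print(strip_suffix(UNESCAPED, with_dot))     # Gospel&spiritual 1 - Track 03
    print(strip_suffix(ESCAPED, with_dot))       # Gospel&spiritual 1 - Track 03
    print(strip_suffix(UNESCAPED, without_dot))  # Track 0  (the "3" is eaten)
    print(strip_suffix(ESCAPED, without_dot))    # unchanged
```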
DetlevD
Roman numerals in uppercase

This "Regular Expression" changes Roman numerals in uppercase.
Valid range of numbers: "I" to "MMMCMXCIX" (decimal: 1-3999).

Example
From:
"ab i ab ii ab iii iv v vv vi vii viii ix abc x mcmliv ll cmm mmix-ix-xi"
To:
"ab I ab II ab III IV V vv VI VII VIII IX abc X MCMLIV ll cmm MMIX-IX-XI"

CODE
$regexp(%TITLE%,'\b(?i:(?=[MDCLXVI])((M{0,3})((C[DM])|(D?C{0,3}))?((X[LC])|(L?X{0,3})|L)?((I[VX])|(V?(I{0,3}))|V)?))\b','\U$0')



Alternative:
(using the 'ignore case' parameter of the Mp3tag $regexp function instead of regex modifier)
CODE
$regexp(%TITLE%,'\b(?=[MDCLXVI])((M{0,3})((C[DM])|(D?C{0,3}))?((X[LC])|(L?X{0,3})|L)?((I[VX])|(V?(I{0,3}))|V)?)\b','\U$0',1)



Attached is an Mp3tag export script (.mte) that visualizes the results of three regular expressions which upcase Roman numerals with varying quality.

DD.20100831.1133.CEST
Edit. Spelling error in RegEx corrected and zip file attached.
DD.20110320.1518.CET
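
The "Alternative" form can be tried in Python as a rough sketch. Assumptions: `re.IGNORECASE` plays the role of the `$regexp` ignore-case parameter, and a callback that upper-cases the match replaces `\U$0`, which Python's `re` does not support. The function name is invented.

```python
import re

# DetlevD's pattern (the "Alternative" form), unchanged.
ROMAN = re.compile(
    r"\b(?=[MDCLXVI])((M{0,3})((C[DM])|(D?C{0,3}))?"
    r"((X[LC])|(L?X{0,3})|L)?((I[VX])|(V?(I{0,3}))|V)?)\b",
    re.IGNORECASE,
)


def upcase_roman(text: str) -> str:
    """Upper-case every whole word that is a valid Roman numeral."""
    return ROMAN.sub(lambda match: match.group(0).upper(), text)


if __name__ == "__main__":
    print(upcase_roman("ab i ab ii ab iii iv v vv vi vii viii ix abc x mcmliv ll cmm mmix-ix-xi"))
```

Keep in mind that ordinary words which happen to be valid numerals, such as "mix" (1009), are upper-cased as well.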
DetlevD
Splitting an "Upper Camel Case" string

The following "Regular Expression" splits an "Upper Camel Case" string into components by inserting a space character before any Word which starts with a capital letter or digit.

Example
From:
"ThisIsThe2ndSongFromD.D.'sFirstAlbum30YearsAgo."
To:
"This Is The 2nd Song From D.D.'s First Album 30 Years Ago."

CODE
$regexp(%_FILENAME%,'(?<!^)(\u\l|(?<=\l)[\u\d])',' $1')


DD.20100917.1902.CEST

Edit.DD.20110816.1848.CEST
InspectorMustache
QUOTE (nickless @ Mar 29 2005, 22:22) *
This RegEx will convert abbreviations composed of single chars and points between (and/or behind it) to uppercase.
Single chars without points around remains lowercase, except if a "-"char and a space is before it. (Example: "Songname - A text")

This didn't work for me. I tried to make my own and came up with this simple RegEx to get the cases right in my tags:

REGULAR EXPRESSION:
\b(?<!')(\w)
REPLACE WITH:
\u\1

This works for me. You can apply this to the _ALL field but be aware that this capitalizes your extensions too. If this looks somehow annoying to you (as it does to me), simply apply this to the _FILENAME field afterwards:

REGULAR EXPRESSION:
\.([^\.]+)$
REPLACE WITH:
.\L\1\E

These are both pretty simple but I spent quite a time figuring out how to make these apply for Unicode. I finally found out that Mp3tag has Unicode support built into its RegEx engine. Heh.


Edit: Here's some improvement. The last one only replaced the first letter of the word and put it into upper case disregarding the rest of the word. This one also puts those that follow the first letter into lower case ("DAItro" becomes "Daitro"):

REGULAR EXPRESSION:
\b(?<!')([a-zA-Z])([^']*?)\b
REPLACE WITH:
\u\1\L\2\E

This next one is basically the same but takes note of the French article L' and puts the letter following the article into upper case. So, for example, "L'eau" becomes "L'Eau".

REGULAR EXPRESSION:
\b(?<!(?<!\s[Ll])')([a-zA-Z])([^']*?)\b
REPLACE WITH:
\u\1\L\2\E
Juozas V
A script to fix capitalization in tags according to English rules. This is my n-th attempt, but I think it works quite nicely... Of course some manual work is still needed, because it has no corpus for checking each word's part of speech (POS).

CODE
[#0]
T=1
F=_TAG
1=1
2=

[#1]
T=4
F=_TAG
1=\\b(A|An|The|And|But|Or|So|After|Before|Out|When|While|Since|Until|Although|Even If|Because|About|Above|Across|Against|Along|Alongside|As|At|Below|By|During|For|From|In|Into|Of|Off|On|Onto|Over|Than|Through|Till|To|Under|Up|With|Within|Without)\\b
2=$lower($1)
3=0

[#2]
T=4
F=_TAG
1=^\\s*(\\w+)
2=$caps($1)
3=0

[#3]
T=4
F=_TAG
1=(\\w+)\\s*$
2=$caps($1)
3=0


Or directly in mp3tag:
Case conversion
Field _TAG
Case conversion Mixed Case
Words begin...

Replace with regular expression
Field _TAG
Regular expression \b(A|An|The|And|But|Or|So|After|Before|Out|When|While|Since|Until|Although|Even If|Because|About|Above|Across|Against|Along|Alongside|As|At|Below|By|During|For|From|In|Into|Of|Off|On|Onto|Over|Than|Through|Till|To|Under|Up|With|Within|Without)\b
Replace matches with $lower($1)

Replace with regular expression
Field _TAG
Regular expression ^\s*(\w+)
Replace matches with $caps($1)

Replace with regular expression
Field _TAG
Regular expression (\w+)\s*$
Replace matches with $caps($1)
DetlevD
Regular Expression Tutorial

Beginner or Professional!
Please take a few minutes and look at this presentation (slideshow or PDF):

Andrei’s Regex Clinic

This is an outstanding piece of work that illustrates the world of regular expressions from many angles.
The tutorial can help to open your mind.

http://zmievski.org/c/dl.php?file=talks/co...egex-clinic.pdf
http://www.slideshare.net/andreizm/andreis-regex-clinic
http://zmievski.org/2010/05/regex-clinic-on-slideshare

DD.20110110.1912.CET
pone
QUOTE (DetlevD @ Jan 10 2011, 19:09) *
Regular Expression Tutorial

Beginner or Professional!
Please take a few minutes and look at this presentation (slideshow or PDF):

Andrei’s Regex Clinic

Are all the things described there useable for mp3tag?
DetlevD
QUOTE (pone @ Jan 10 2011, 21:02) *
Are all the things described there useable for mp3tag?

Probably not, one must abstract.

DD.20110110.2111.CET
Doug Mackie
To Juozas V:
Thank you very much for this. It does the best English title case conversion that I have seen here.

However, I have one minor quibble. Your word list includes some words and phrases that I see more often used in song titles as subordinate conjunctions than as prepositions. For example "after", "because", and "although". From my reading on capitalization in titles, subordinate conjunctions should always be capitalized. Since it's not practical to use script to detect how a word is used, I chose to remove those words from your Reg Ex on the basis that there should be fewer errors without them than with them. The words that I removed are:

After, As, Although, Because, Even If, Since, Till, Until, When, and While.

Here is my revised word list (resorted alphabetically):

A|About|Above|Across|Against|Along|Alongside|An|And|As|At|Before|Below|But|By|During|For|From|In|Into|Nor|Of|Off|On|Onto|Or|Out|Over|So|Than|The|Through|To|Under|Up|With|Within|Without


Note that I also added the coordinating conjunction "Nor" to your list.

Best regards,
Doug M. in NJ
LosMintos
Convert MP3tag's date to ISO date

Converts (example) 18.04.2011 to 2011-04-18.

format tag field
Field: date added
Format string: %_date%

replace with regular expression
Field: date added
RegExp: ^(\d+)\.(\d+)\.(\d+)$
Replace with: $3-$2-$1
Zoofield
I could not get, "DetlevD - Splitting an Upper Camel Case string" to work, although I like the idea!

These together:
CODE
RE:Adds a space between Capital and lowercase letter or digit behind it.
Field:_Tag
re:([^A-Z\W\_])([A-Z])(?=[^A-Z])
($1) ($2)

RE:Adds a space between Digit and lower case letter behind it
Field:_Tag
re:([^\W\d\_])(\d)
($1) ($2)

Will...

Example.
From:
"ThisIsThe2ndSongFromD.D.'sFirstAlbum30YearsAgo."
To:
"This Is The 2nd Song From D.D.'s First Album 30 Years Ago."


This is the two above combined:
CODE
RE:Adds a space between, Capital and lowercase letter or digit behind it, Digit and lower case letter behind it.
Field:_Tag
re:([^A-Z\W\_])([A-Z])(?=[^A-Z])|([^\W\d\_])(\d)
$1$3 $2$4

Does the same except in the case of...

CapitalWord9CapitalWord2ndSong30Years

Example.
First pass will: Capital Word9 Capital Word 2nd Song 30 Years
Second pass will: Capital Word 9 Capital Word 2nd Song 30 Years
Third pass will: Reveal A Latent O.C. Disorder

Because the single digit '9' in the example can only be captured once per pass per replacement.

I would use the first set for completeness and through the "Action Groups".
I would use the Second for brevity and through "Actions (Quick)"
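
The two-step version can be tried with this Python sketch. It assumes Python's `re` accepts the same character classes, and it writes the replacement as group 1, a space, group 2 (`\1 \2`); the function name is made up.

```python
import re

# Zoofield's first set: two expressions applied one after the other.
BEFORE_CAPITAL = re.compile(r"([^A-Z\W_])([A-Z])(?=[^A-Z])")
BEFORE_DIGIT = re.compile(r"([^\W\d_])(\d)")


def split_camel_case(text: str) -> str:
    """Insert spaces before capitalised words, then before digits."""
    text = BEFORE_CAPITAL.sub(r"\1 \2", text)
    return BEFORE_DIGIT.sub(r"\1 \2", text)


if __name__ == "__main__":
    print(split_camel_case("ThisIsThe2ndSongFromD.D.'sFirstAlbum30YearsAgo."))
    print(split_camel_case("CapitalWord9CapitalWord2ndSong30Years"))
```

Run as two separate passes, the single digit '9' gets a space on both sides, because each pass can use the characters around it independently.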
tobi06
To remove any website in the filename


example: artist - title[www.whateverwebsite.com].mp3
artist - title.mp3

regular expression: \[.{3}\.\w*\..{3}\]
gingernob
QUOTE (areve @ Oct 16 2003, 21:42) *
Regular expression: ^\s*[0-9]+\s*-\s*
Replace with:
This one will remove the track-number (if followed by a dash) and white space at the beginning of a string
(for instance 01 - Come Together will become Come Together)



OK, i know im being dumb, just this reg ex stuff is over my head..u might just as well speak japanese to me.
All the examples i find talk of removing ## - [track title] to [track title]
i just want ## [track title] to [track title]...no dash.
appreciate a 'simple' answer for simpleton.
dano
I've added an example without dash to that post.
gingernob
QUOTE (dano @ Jun 19 2011, 21:14) *
I've added an example without dash to that post.

Thank u so much...
DetlevD
Convert from RFC822/1123 Date string to ISO-8601 Date string.

Examples:
RFC822/RFC 1123 ==> ISO-8601
1 Feb 2009 ==> 2009-02-01
30 Sep 2010 ==> 2010-09-30

$replace($regexp($right('0'%YEAR%,11),
'(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (\d{2,4})','$3-$2-$1'),
'Jan','01','Feb','02','Mar','03','Apr','04','May','05','Jun','06',
'Jul','07','Aug','08','Sep','09','Oct','10','Nov','11','Dec','12')


See also:
http://www.w3.org/Protocols/rfc822/
http://www.freesoft.org/CIE/RFC/1123/99.htm

DD.20110801.0623.CEST


Example:
1 Feb 2009 HH:MM ==> 2009-02-01 HH:MM

$replace($regexp($right('0'$cutRight(%YEAR%,6),11),
'(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (\d{2,4})','$3-$2-$1'),
'Jan','01','Feb','02','Mar','03','Apr','04','May','05','Jun','06',
'Jul','07','Aug','08','Sep','09','Oct','10','Nov','11','Dec','12')$right(%YEAR%,6)


DD.20140310.2236.CET
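
The same conversion as a Python sketch, for comparison. It keeps the regular expression from the post but swaps in a different technique for the rest: a lookup table instead of the `$replace` chain, and number formatting instead of the `$right('0'...)` padding trick. The function name is hypothetical.

```python
import re

MONTHS = {
    "Jan": "01", "Feb": "02", "Mar": "03", "Apr": "04", "May": "05", "Jun": "06",
    "Jul": "07", "Aug": "08", "Sep": "09", "Oct": "10", "Nov": "11", "Dec": "12",
}

# Same pattern as in the format string above.
DATE = re.compile(
    r"(\d{1,2}) (Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) (\d{2,4})"
)


def rfc_date_to_iso(value: str) -> str:
    """Turn 'D Mon YYYY' into 'YYYY-MM-DD'; anything after the date is kept."""
    def build(match: re.Match) -> str:
        day, month, year = match.groups()
        return f"{year}-{MONTHS[month]}-{int(day):02d}"

    return DATE.sub(build, value)


if __name__ == "__main__":
    print(rfc_date_to_iso("1 Feb 2009"))        # 2009-02-01
    print(rfc_date_to_iso("30 Sep 2010"))       # 2010-09-30
    print(rfc_date_to_iso("1 Feb 2009 12:30"))  # 2009-02-01 12:30
```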
DetlevD
How to copy a list of artists and their roles ...
... from tag-field COMMENT
... to tag-field INVOLVEDPEOPLE


Example 1
From: COMMENT
Person1:Role1
Person2:Role2
Person3:Role3

To: INVOLVEDPEOPLE
Role1:Person1;Role2:Person2;Role3:Person3;

Action: Format value
Field: INVOLVEDPEOPLE
Formatstring:
$regexp(%COMMENT%$char(13),'(.+?):(.+?)[\r\n]+','$2:$1;')


Example 2
From: COMMENT
Person1:Role1
Person2: Role2a,Role2b
Person3 : Role3a, Role3b
Person4: Role4a & Role4b, Role4c

To: INVOLVEDPEOPLE
Role1:Person1;Role2a,Role2b:Person2;Role3a,Role3b:Person3;Role4a & Role4b,Role4c:Person4;

Action: Format value
Field: INVOLVEDPEOPLE
Formatstring:
$regexp($regexp(%COMMENT%$char(13),'(.+?)\s*:\s*(.+?)[\r\n]+','$2:$1;'),'\s*,\s*',',')

DD.20110824.1757.CEST
pone
QUOTE (DetlevD @ Aug 24 2011, 16:55) *
$regexp(%COMMENT%$char(13),'(.+?):(.+?)[\r\n]+','$2:$1;')
...
$regexp($regexp(%COMMENT%$char(13),'(.+?)\s*:\s*(.+?)[\r\n]+','$2:$1;'),'\s*,\s*',',')


What is $char(13) ? And why is it needed here?
DetlevD
QUOTE (pone @ Aug 24 2011, 17:29) *
What is $char(13) ? And why is it needed here?

$char(13) is the "CarriageReturn" control character.
It is appended to the COMMENT string on the fly as a helper, to make sure there is at least one carriage return at the end of the string. That lets the RegExp work correctly even when the original COMMENT has no trailing CarriageReturn/LineFeed sequence.

DD.20110824.1750.CEST
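
The effect of the appended carriage return can be reproduced in Python. This is only a sketch: it assumes `re.sub` is a fair stand-in for `$regexp`, uses the expression from Example 2, and adds a hypothetical `append_cr` switch purely to show what happens without the helper character.

```python
import re


def comment_to_involved_people(comment: str, append_cr: bool = True) -> str:
    """Turn 'Person:Role' lines into 'Role:Person;' pairs."""
    if append_cr:
        # Same idea as %COMMENT%$char(13): guarantee a line ending after the last line.
        comment = comment + "\r"
    swapped = re.sub(r"(.+?)\s*:\s*(.+?)[\r\n]+", r"\2:\1;", comment)
    # Outer $regexp: remove spaces around commas.
    return re.sub(r"\s*,\s*", ",", swapped)


if __name__ == "__main__":
    comment = "Person1:Role1\r\nPerson2: Role2a,Role2b\r\nPerson3 : Role3a, Role3b"
    print(comment_to_involved_people(comment))
    # Without the helper character the last line is left unconverted.
    print(comment_to_involved_people("A:x\nB:y", append_cr=False))
```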
naisanza
If you want to add TEXT to the existing tag:

Regular expression: (.*)
Replace with: TEXT \1 \0

I found that if you don't put in the \0 at the end it will repeat TEXT twice. For example:

Title: Texty text
Becomes: TEXT Texty textTEXT


I'm using v2.48. This might have been fixed in later versions, but it's the version I've been using, and I'd rather not update since I've never had any other issues with it except for this one bit of annoyance.
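
The doubled TEXT can be reproduced in Python 3.7 or later, which suggests a possible cause: `(.*)` matches the whole title and then matches once more, as an empty string, at the very end. Whether Mp3tag's engine does exactly the same is an assumption; anchoring the expression with `^` and `$` is one way to rule it out. The function names are made up.

```python
import re


def prefix_unanchored(title: str) -> str:
    # (.*) matches the title, then an empty string at the end: two replacements.
    return re.sub(r"(.*)", r"TEXT \1", title)


def prefix_anchored(title: str) -> str:
    # ^...$ allows only one match, so TEXT is added once.
    return re.sub(r"^(.*)$", r"TEXT \1", title)


if __name__ == "__main__":
    print(repr(prefix_unanchored("Texty text")))  # 'TEXT Texty textTEXT '
    print(repr(prefix_anchored("Texty text")))    # 'TEXT Texty text'
```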