VoxCommando

Topic started by: xtermin8r on May 22, 2014, 12:18:21 PM

Title: Regex help to extract artist, song from web page
Post by: xtermin8r on May 22, 2014, 12:18:21 PM
Dear Voxinator

I'm struggling to construct the regex pattern that extracts the artist and song name for another MPC experiment I'm doing.
The URL in question is http://localhost:13579/info.html
Code:
<p id="mpchc_np">&laquo; MPC-HC v1.7.3.0 &bull; Snap - The Power &bull; 00:00:00/00:04:24 &bull; 19.4 MB &raquo;</p>

I'm trying to extract Snap as artist and The Power as song, so that I can ask Jarvis to tell me who the artist is, and tell me the song name.

The only thing I could come up with is &bull;(.*?)(.*?)&bull, which gives me Snap - The Power.

Thanks in advance.
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 12:55:11 PM
How consistent is the pattern?

Does
Code:
<p id="mpchc_np">&laquo; MPC-HC v1.7.3.0 &bull;
always precede the song name?

Is the artist name always separated from the song name by " - "?

Does a bullet ("&bull;") always appear after the band name?
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 12:59:46 PM
E.g. the following gets the info you want, but you may run into problems depending on how much the pattern varies.

Code:
<?xml version="1.0" encoding="utf-16"?>
<!--VoxCommando 1.9.5.1-->
<command id="1151" name="get song info" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
  <action>
    <cmdType>Results.SetLastResult</cmdType>
    <params>
      <param>&lt;p id="mpchc_np"&gt;&amp;laquo; MPC-HC v1.7.3.0 &amp;bull; Snap - The Power &amp;bull; 00:00:00/00:04:24 &amp;bull; 19.4 MB &amp;raquo;&lt;/p&gt;</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>Results.RegEx</cmdType>
    <params>
      <param>&amp;bull;\s(.*?).-.(.*?)&amp;bull;</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>OSD.ShowText</cmdType>
    <params>
      <param>Artist: {Match.1.1}</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>OSD.AddText</cmdType>
    <params>
      <param>Song: {Match.1.2}</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
</command>

In the above, {Match.{i}.1} will always be the artist, and {Match.{i}.2} will be the song, if the pattern is consistent throughout.
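
For anyone who wants to try the pattern outside VoxCommando, here is a minimal Python sketch of the same regex run against the sample line from the first post. The function name `extract_artist_song` is made up for illustration, and Python's `re` module is only standing in for Results.RegEx, so treat it as an approximation of what the command does rather than VoxCommando's actual behaviour.

```python
import re

# Sample line from the first post (HTML entities left as they appear in the page source).
SAMPLE = ('<p id="mpchc_np">&laquo; MPC-HC v1.7.3.0 &bull; Snap - The Power '
          '&bull; 00:00:00/00:04:24 &bull; 19.4 MB &raquo;</p>')

# Same pattern as in the command above; inside the XML file "&" is written as "&amp;".
PATTERN = re.compile(r"&bull;\s(.*?).-.(.*?)&bull;")


def extract_artist_song(html):
    """Return (artist, song) from the first match, or None if nothing matches."""
    match = PATTERN.search(html)
    if match is None:
        return None
    # Group 1 plays the role of {Match.1.1}, group 2 of {Match.1.2}.
    # strip() drops the space that group 2 picks up before the second &bull;.
    return match.group(1).strip(), match.group(2).strip()


print(extract_artist_song(SAMPLE))
```

Note that the second group captures the space before the closing &bull;, which is why the sketch strips whitespace from both parts.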

[edited to correct which is artist and which is song.  :biglaugh]
Title: Re: Regex help to extract artist, song from web page
Post by: xtermin8r on May 22, 2014, 01:01:21 PM
Quote
How consistent is the pattern?

Does
Code:
<p id="mpchc_np">&laquo; MPC-HC v1.7.3.0 &bull;
always precede the song name?
Yes.

Quote
Is the artist name always separated from the song name by " - "?
not always

Quote
Does a bullet ("&bull;") always appear after the band name?
Yes, always after the full artist and song name.
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 01:04:50 PM
Quote
not always

That will be a problem. Are there a set number of possibilities? Is it a public web page that I can look at?
Title: Re: Regex help to extract artist, song from web page
Post by: xtermin8r on May 22, 2014, 01:05:51 PM
Thank you nime5ter. It's exactly what I needed. If the artist name and song are not separated by a "-", I guess I will have to manually insert it into the file name.
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 01:08:09 PM
Yeah, I was just about to say, if it's your own library, then the best thing would be if you could rename the files in a consistent way.
Title: Re: Regex help to extract artist, song from web page
Post by: xtermin8r on May 22, 2014, 01:09:25 PM
Quote
That will be a problem. Are there a set number of possibilities? Is it a public web page that I can look at?

It's not a public web page; it's the web page of Media Player Classic. I could upload the HTML if you like.
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 01:14:57 PM
If you like, sure. But it seems like it would make more sense, and be more reliable, if you could create consistent file names and then work with that.

The above obviously won't work perfectly if you have file names with more than one " - " in the name.
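
To make that caveat concrete, here is a rough Python sketch comparing the pattern from the command with a slightly stricter variant. The stricter pattern (`STRICT`), the helper name `split_title`, and both file names are assumptions made up for this example, and Python's `re` is only a stand-in for Results.RegEx.

```python
import re

# Pattern from the command above: "." on either side of "-" matches any character.
LOOSE = re.compile(r"&bull;\s(.*?).-.(.*?)&bull;")
# Assumed variant: requires whitespace around "-" and trims the space before &bull;.
STRICT = re.compile(r"&bull;\s(.*?)\s-\s(.*?)\s*&bull;")


def split_title(pattern, text):
    """Return (artist, song) for the first match of pattern, or None."""
    m = pattern.search(text)
    return (m.group(1).strip(), m.group(2).strip()) if m else None


# Hypothetical file names, wrapped the way the MPC-HC page wraps them.
hyphenated = "&bull; Jay-Z - Encore &bull;"
two_dashes = "&bull; Snap - The Power - Remix &bull;"

print(split_title(LOOSE, hyphenated))   # splits inside "Jay-Z"
print(split_title(STRICT, hyphenated))  # splits at " - "
print(split_title(STRICT, two_dashes))  # splits at the first " - " only
```

The stricter variant copes with a hyphen inside a name, but with two " - " separators it can only guess that the first one divides artist from song.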

Title: Re: Regex help to extract artist, song from web page
Post by: xtermin8r on May 22, 2014, 01:26:35 PM
Quote
If you like, sure. But it seems like it would make more sense, and be more reliable, if you could create consistent file names and then work with that.
Personally I think there is no need to upload it (waste of digital space)  :biglaugh; the relevant info is in the first post.
I agree, it would be more reliable if file names are consistent and include a "-" between the artist and song name.

Quote
The above obviously won't work perfectly if you have file names with more than one " - " in the name.
True, I will have to use other methods to get rid of any extra "-" in the name.
Title: Re: Regex help to extract artist, song from web page
Post by: nime5ter on May 22, 2014, 01:34:44 PM
Quote
True, I will have to use other methods to get rid of any extra "-" in the name.

Or use a more unique character to separate song and artist in your file names.
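
As a rough illustration of that idea, the Python sketch below assumes file names use " ~ " between artist and song. The separator, the function name `parse_now_playing`, and the sample strings are all hypothetical; any separator that never occurs in your artist or song names would work the same way.

```python
import re

SEPARATOR = " ~ "  # hypothetical choice of separator

# Grab whatever sits between the first two &bull; markers (the file name MPC-HC shows).
TITLE = re.compile(r"&bull;\s*(.*?)\s*&bull;")


def parse_now_playing(html, separator=SEPARATOR):
    """Return (artist, song); artist is None when the separator is missing."""
    m = TITLE.search(html)
    if m is None:
        return None
    artist, found, song = m.group(1).partition(separator)
    if not found:
        # No separator in the file name: report the whole name as the song.
        return None, m.group(1)
    return artist.strip(), song.strip()


print(parse_now_playing("&bull; Jay-Z ~ 99 Problems - Live &bull;"))
print(parse_now_playing("&bull; The Power &bull;"))
```

Because the split happens only on the chosen separator, hyphens and extra " - " sequences in either part no longer matter, and files that were never renamed are easy to spot because the artist comes back empty.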
Title: Re: Regex help to extract artist, song from web page
Post by: jitterjames on May 23, 2014, 10:05:51 AM
Looks like it is just returning the filename.

I recommend you use a proper music management program like MediaMonkey.  It is free and can't be beat. Even if you don't want to use it to play your music for some reason, you can still use it to organise your music, maintain proper tags, album art etc. And it can then organise your files with predictable path and filename formats.