VoxCommando

Help and Support (Using VoxCommando) => Other Plugins => Topic started by: oswalsh on July 03, 2015, 10:24:27 AM

Title: RoboB - GetElementByClassName
Post by: oswalsh on July 03, 2015, 10:24:27 AM
Hey,

I just started messing around with Robo Browser, and it's awesome! However, I'm trying to get it to read the news, and on the CBC site the news is contained in <div class="story-content">. Is there a way to get an element by class name?

I'm not sure how to use regex to match a single class name. When I try things like (?:^|\W)story-content(?:$|\W), I get all the text from the page, though I would have thought this was the way to do it.
Title: Re: RoboB - GetElementByClassName
Post by: nime5ter on July 03, 2015, 10:52:02 AM
I can't check this without knowing which specific web page you're using (cbc.ca/news didn't seem to have any divs with that class name), but I think you should be able to use the RoboB.ElementRegex (http://voxcommando.com/mediawiki/index.php?title=Plugin_RoboB#ElementRegex) action, like so:

Code: [Select]
<?xml version="1.0" encoding="utf-16"?>
<!--VoxCommando 2.1.5.0-->
<command id="688" name="Get story content class content" enabled="true" alwaysOn="False" confirm="False" requiredConfidence="0" loop="False" loopDelay="0" loopMax="0" description="">
  <action>
    <cmdType>RoboB.Select</cmdType>
    <params>
      <param>cbc</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RoboB.Navigate</cmdType>
    <params>
      <param>YOUR URL</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RoboB.Wait</cmdType>
    <params />
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RoboB.Show</cmdType>
    <params />
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RoboB.ElementRegex</cmdType>
    <params>
      <param>div</param>
      <param>class="story-content"</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RoboB.GetHTML</cmdType>
    <params />
    <cmdRepeat>1</cmdRepeat>
  </action>
  <action>
    <cmdType>RegExTool.Open</cmdType>
    <params>
      <param>True</param>
    </params>
    <cmdRepeat>1</cmdRepeat>
  </action>
  <phrase>Get story content class content</phrase>
</command>
Title: Re: RoboB - GetElementByClassName
Post by: jitterjames on July 03, 2015, 12:09:00 PM
@Nime5ter that probably won't quite work, because it will select the first parent div that contains this string anywhere in it. Usually we need to anchor the regex pattern, either to the beginning of the div's string using ^ or to the end of the string using $.

So the pattern should probably look something like this:

Code: [Select]
^<div\sclass="story-content">
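
For reference, here is a sketch of how the ElementRegex action from the earlier command might look with this anchored pattern dropped in. It assumes the same two-parameter form used in that command (tag name first, then the regex), and that when the pattern is stored in the command XML the angle brackets are escaped as &lt; and &gt;. If the div on your page carries other attributes, or lists them in a different order, the pattern would need adjusting.

```xml
<!-- Sketch only: replaces the RoboB.ElementRegex action in the command above. -->
<action>
  <cmdType>RoboB.ElementRegex</cmdType>
  <params>
    <!-- Tag to search, as in the original command -->
    <param>div</param>
    <!-- ^ anchors the match to the start of the div's own string, -->
    <!-- so a parent div that merely contains the class is not selected -->
    <param>^&lt;div\sclass="story-content"&gt;</param>
  </params>
  <cmdRepeat>1</cmdRepeat>
</action>
```

The rest of the command (Select, Navigate, Wait, GetHTML and so on) would stay as it was.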
Title: Re: RoboB - GetElementByClassName
Post by: nime5ter on July 03, 2015, 01:15:18 PM
Ah yes, thanks. An important clarification. You've mentioned this before, I just keep forgetting. :)
Title: Re: RoboB - GetElementByClassName
Post by: oswalsh on July 05, 2015, 03:10:08 PM
Just got back to playing with this, and it works perfectly! Thanks, you two.