Bug #217

IMDB Scraper Function for option "Enable full cast credits"

Added by Indy about 12 years ago. Updated almost 12 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Scraper (Music/Video Metadata)
Target version:
Start date: 16/09/2012
Due date:
% Done: 100%
Resolution: fixed
Affected Version:

Description

More Details in this thread:
http://www.xbmc4xbox.org.uk/forum/viewtopic.php?f=9&t=431

The imdb.xml in the common folder is missing a function for fetching the
full cast credits when the "Enable full cast credits" option is enabled in
the IMDB scraper. As a result, no cast is fetched when this option is used.
The option is disabled by default.

As a solution, a new function, GetIMDBFullCast, has been added to the file.
It was taken from XBMC mainline, where it is named ParseIMDBFullCast.
Below are the diffs for the two updated files against the ones delivered
with the 3.2 release (which are the same as the most recent ones on trunk).
The diffs are also attached as files.

File: system/scrapers/video/common/imdb.xml

DIFF:
diff XBMC4XBOX-3.2-STABLE/system/scrapers/video/common/imdb.xml Update/system/scrapers/video/common/imdb.xml
51a52,65
> <GetIMDBFullCast dest="5">
> <RegExp input="$$2" output="&lt;details&gt;\1&lt;/details&gt;" dest="5">
> <RegExp input="$$1" output="\1" dest="6">
> <expression noclean="1"><table class="cast">(.*?)</table></expression>
> </RegExp>
> <RegExp input="$$6" output="&lt;actor&gt;&lt;thumb&gt;\1_SX512_SY512_\2&lt;/thumb&gt;&lt;name&gt;\3&lt;/name&gt;&lt;role&gt;\5&lt;/role&gt;&lt;/actor&gt;" dest="7">
> <expression repeat="yes" clear="yes" fixchars="3,5" trim="3,5" noclean="1,2"><img src="(?:([^"]*\.)[^"]*(\.jpg))?[^>]*[^"]*"nm"><a href="[^"]*[^>]*>([^<]*)<[^"]*"ddd">([^<]<)?[^"]*"char">(.*?)</td></expression>
> </RegExp>
> <RegExp input="$$7" output="&lt;actor&gt;&lt;thumb&gt;\1&lt;/thumb&gt;\2&lt;/actor&gt;" dest="2+">
> <expression repeat="yes" clear="yes" noclean="1,2,3"><actor><thumb>(?:(http.*?)|_SX[0-9]+_SY[0-9]+_)</thumb>(.*?)</actor></expression>
> </RegExp>
> <expression noclean="1" />
> </RegExp>
> </GetIMDBFullCast>
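
For readers unfamiliar with the scraper XML, the chain of expressions above can be approximated in Python. This is only a rough sketch: the sample markup, the name parse_full_cast and the simplified row pattern are assumptions made for illustration (the "ddd" cell handled by the original expression is left out), and it does not reproduce the real IMDb page. It shows the three steps: isolate the cast table, pull thumb/name/role out of each row, and keep a thumbnail only when an image URL was actually captured.

```python
import re

# Hypothetical, simplified markup; not the real IMDb full credits page.
SAMPLE_PAGE = (
    '<html><table class="cast">'
    '<tr><td><img src="http://example.invalid/a._V1_SX32_.jpg"></td>'
    '<td class="nm"><a href="/name/nm0000001/">Actor One</a></td>'
    '<td class="char">Role One</td></tr>'
    '<tr><td><img src="/images/no_photo.gif"></td>'
    '<td class="nm"><a href="/name/nm0000002/">Actor Two</a></td>'
    '<td class="char">Role Two</td></tr>'
    '</table></html>'
)

# Step 1: isolate the cast table (first inner RegExp in the scraper function).
CAST_TABLE = re.compile(r'<table class="cast">(.*?)</table>', re.S)

# Step 2: one match per cast row; simplified from the scraper expression.
# Groups: 1 = image URL up to the size suffix, 2 = ".jpg", 3 = name, 4 = role.
CAST_ROW = re.compile(
    r'<img src="(?:([^"]*\.)[^"]*(\.jpg))?[^>]*>.*?'
    r'class="nm"><a href="[^"]*">([^<]*)</a>.*?'
    r'class="char">(.*?)</td>',
    re.S,
)


def parse_full_cast(page):
    """Return a list of {'name', 'role', 'thumb'} dicts for the cast table."""
    table = CAST_TABLE.search(page)
    if table is None:
        return []
    cast = []
    for match in CAST_ROW.finditer(table.group(1)):
        prefix, ext, name, role = match.groups()
        # Step 3: request a 512x512 thumbnail only if a .jpg URL was captured;
        # otherwise leave the thumb empty, as the last scraper RegExp does.
        thumb = f"{prefix}_SX512_SY512_{ext}" if prefix else ""
        cast.append({"name": name.strip(), "role": role.strip(), "thumb": thumb})
    return cast


if __name__ == "__main__":
    for actor in parse_full_cast(SAMPLE_PAGE):
        print(actor)
```

The empty thumb for the second row mirrors the final expression in the diff, which drops a thumb that consists only of the _SX…_SY…_ size suffix.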

Furthermore, the scraper's own imdb.xml has been updated to call this function
when the option is enabled.

File: system/scrapers/video/imdb.xml

DIFF:
diff XBMC4XBOX-3.2-STABLE/system/scrapers/video/imdb.xml Update/system/scrapers/video/imdb.xml
123c123
< <RegExp conditional="fullcredits" input="$$2" output="&lt;url cache=&quot;$$2-fullcredits.html&quot; function=&quot;GetIMDBCast&quot;&gt;$$3fullcredits&lt;/url&gt;" dest="5+">
---
> <RegExp conditional="fullcredits" input="$$2" output="&lt;url cache=&quot;$$2-fullcredits.html&quot; function=&quot;GetIMDBFullCast&quot;&gt;$$3fullcredits&lt;/url&gt;" dest="5+">
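
To make the changed line easier to read, the following Python sketch expands its output attribute the way the scraper engine presumably does: the XML entities are unescaped and each $$n is replaced by the contents of buffer n. The function name expand_output and the buffer values (a title ID in buffer 2, a base URL in buffer 3) are assumptions inferred from the template, not taken from the scraper source.

```python
import html
import re

# The output attribute of the changed RegExp, as written in the XML file.
OUTPUT_ATTR = (
    "&lt;url cache=&quot;$$2-fullcredits.html&quot; "
    "function=&quot;GetIMDBFullCast&quot;&gt;$$3fullcredits&lt;/url&gt;"
)


def expand_output(template, buffers):
    """Unescape XML entities, then substitute $$n with buffers[n] (empty if unset)."""
    text = html.unescape(template)
    return re.sub(r"\$\$(\d+)", lambda m: buffers.get(int(m.group(1)), ""), text)


if __name__ == "__main__":
    # Hypothetical buffer contents, chosen only for illustration.
    buffers = {2: "tt0000000", 3: "http://example.invalid/title/tt0000000/"}
    print(expand_output(OUTPUT_ATTR, buffers))
```

The resulting url element names the function that will parse the downloaded full credits page, which is why changing GetIMDBCast to GetIMDBFullCast on this one line is enough to route the page to the new function.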

I have tested the movie scraper with the full cast option both enabled and
disabled, and everything seems to work fine now.


Files

imdb.xml_scraper.txt (377 Bytes), Indy, 16/09/2012 02:03 PM
imdb.xml_common.txt (1.22 KB), Indy, 16/09/2012 02:03 PM
imdb-patch.diff (2.65 KB), Indy, 19/09/2012 08:49 PM
Actions #1

Updated by buzz about 12 years ago

Please note that diffs should be supplied in unified format. They should be generated from an svn checkout of trunk, so that the changes can be applied in a single operation; this also makes patches more readable. Running "svn diff" from the root of trunk gives the required output.
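
As an illustration of what unified format looks like for the one-line scraper change, here is a small sketch using Python's standard difflib module. The OLD and NEW lines are shortened stand-ins for the real file contents and make_unified_diff is a hypothetical helper; the header labels produced by "svn diff" will also differ slightly (it adds revision information).

```python
import difflib

# Shortened stand-ins for the changed line and its neighbours (hypothetical content).
OLD = [
    "<!-- context before -->\n",
    '<url function="GetIMDBCast">fullcredits</url>\n',
    "<!-- context after -->\n",
]
NEW = [
    "<!-- context before -->\n",
    '<url function="GetIMDBFullCast">fullcredits</url>\n',
    "<!-- context after -->\n",
]


def make_unified_diff(old_lines, new_lines, path):
    """Return a unified diff of two versions of the same file as one string."""
    return "".join(
        difflib.unified_diff(old_lines, new_lines, fromfile=path, tofile=path)
    )


if __name__ == "__main__":
    print(make_unified_diff(OLD, NEW, "system/scrapers/video/imdb.xml"), end="")
```

Unlike the normal diff format (123c123 with < and > lines), the unified format carries the file path and surrounding context lines, so a patch tool can locate and apply every change in one pass.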

Actions #2

Updated by buzz about 12 years ago

I am unable to apply the patches in their current format. Are you running Windows or Linux? On Windows, please follow the instructions here: http://tortoisesvn.net/docs/release/TortoiseSVN_en/tsvn-dug-patch.html. All patches should be made against the current svn HEAD.

Actions #3

Updated by buzz about 12 years ago

  • Status changed from New to Feedback
Actions #4

Updated by Indy about 12 years ago

Patch created with Tortoise.

Actions #5

Updated by buzz about 12 years ago

  • Assignee set to buzz
  • Target version set to 3.3

thanks.

Actions #6

Updated by buzz almost 12 years ago

  • Affected Version set to 3.2

Finally getting around to looking at this. It seems to me that we just need to fix up the existing GetIMDBCast function, which was broken, rather than making a new one, i.e. your code should replace the current GetIMDBCast. Anyway, I'm going to sync up with the imdb scraper from XBMC mainline, which should fix this; right now scraping doesn't seem to be working at all.

Actions #7

Updated by buzz almost 12 years ago

You are actually probably right that there should be separate full cast and non-full cast functions. Perhaps in the past a single function handled both but no longer can due to site changes. I might have to do the same for directors/writers as well.

Actions #8

Updated by buzz almost 12 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 0 to 100
  • Resolution set to fixed

The imdb scraper was updated in r31686 (until the next time it breaks). Thanks for the heads-up regarding full cast. The updated full cast handling is included among other fixes.
