Member
From: South Africa, Port Elizabeth
Registered: 2006-08-23
Posts: 1910
I've been thanked 34 times.
Offline
Hi all,
Mkay, i am finally sifting into the world of RSS, but not for the safari side of things. For my personal site i want to add RSS feeds from Huomah (Dave's site) and Site Reference. i don't want the JavaScript side, because it doesn't add any real value other than just the usefulness to visitors; i want both sides (+content value). I am going through M1's php version at the moment, but for another business portal i am working on, i need an ASP version. I have already put up a JavaScript version but again, it doesn't add any content value.
Do any of you guys know of such a script in ASP?
My up and coming... soon to be real website... www.thewebguy.co.za (one day i will finish it)
Moderator
From: Yorkshire, UK
Registered: 2006-08-19
Posts: 2817
I've been thanked 80 times.
Offline
Member
hmmm... okay, fair enough - i feel a good bit "slower" now. Thanks Northie, i guess i was just looking in the wrong places. As far as i could tell, unless i came up with my own rss system (or aggregator, or whatever) i would have to use another site's rss parser, which would have left my page covered in their logos.
Thanks a lot. I'm sure that covers it for some others as well.
Moderator

doughnut (don't) worry - i usually turn to the forums first, rather than searching: it's easier to get someone else to do the work for you
That said, anyone want to build a logistics management system for me?
Member
lol... i wish it were that easy... it does however speed up the process when you know someone else on here has probably done it
On the topic though (this one i have searched the crap out of): if you look at the home page of the site in my sig you will see M1's rss script. i have tweaked it a bit here and there, but what i am trying to figure out is how to limit the character length of the description without JavaScript?
My up and coming... soon to be real website... www.thewebguy.co.za (one day i will finish it)
Does this help or am i waaayyyy off base: http://www.site-reference.com/webmaster … parser.php
Member
That would have been exactly what i was looking for, except in ASP - plus the php version gives this "XML error: not well-formed (invalid token) at line 119" when trying to use SR feeds... I did actually come across that before i asked for some
(see, i may not always raid google but i do search SR)
I think i have M1's pretty much waxed though, except for the annoying "this-is-the-title-of-this..." format of all the titles and the fact that i can't limit the length of the descriptions... not to worry though, i'm gonna attack the generated script from your link and see what i can pull out.
Edit: I got a feed to work, sifted through the generated code.. no character limitation. Oh well, poo happens right.
Moderator
limiting characters is easy.
use substring (substr() in php)
And some asp equivalents
http://www.webmasterworld.com/forum47/2827.htm
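For the php side, a minimal sketch of what "use substring" can look like. The helper name limit_chars() and the 20-character limit are made up for illustration; substr() and strlen() are the standard php functions (mb_substr() is the multibyte-aware variant, if the mbstring extension is loaded).

```php
<?php
// Hypothetical helper (not part of M1's script): cut $text down to at most
// $limit bytes and append "..." only when something was actually removed.
function limit_chars($text, $limit)
{
    if (strlen($text) <= $limit) {
        return $text;               // already short enough, leave untouched
    }
    return substr($text, 0, $limit) . '...';
}

echo limit_chars('RSS feeds can carry very long descriptions', 20), "\n";
echo limit_chars('Short one', 20), "\n";
```

Note that a plain cut like this can land in the middle of a word or an html tag, so it works best on text that has already had its markup stripped.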
Member
Thanks Northie... i tried just about every variation of substring i could think of... then i googled it and tried many more, but i can't seem to incorporate it into M1's rss code. Could you give me an example of how to do it?
Or throw a hint if you reckon you are making it too easy on me
Code: php
<?php
//
// ScarySoftware RSS parser
// Copyright (c) 2006 Scary Software
// ( Generates HTML from a RSS feeds )
//
// Licensed Under the Gnu Public License
//
// Here are the feeds - you can add to them or change them
$RSSFEEDS = array(
    0 => "http://www.site-reference.com/xml.php?c=all",
    1 => "http://rss.cnn.com/rss/cnn_topstories.rss",
    2 => "http://rss.slashdot.org/Slashdot/slashdot",
);
//
// Makes a pretty HTML page bit from the title,
// description and link
//
function FormatRow($title, $description, $link) {
    return <<<HTML
<dt class="feed_title"><a href="$link" target="_blank">$title</a></dt>
<dd>$description</dd>
HTML;
}
// we'll buffer the output
ob_start();
// Now we make sure that we have a feed selected to work with
if (!isset($feedid)) $feedid = 0;
$rss_url = $RSSFEEDS[$feedid];
// Server friendly page cache
$ttl = 60*60; // 60 secs/min for 60 minutes = 1 hour (3600 secs)
$cachefilename = md5($rss_url);
if (file_exists($cachefilename) && (time() - $ttl < filemtime($cachefilename))) {
    // We recently did the work, so we'll save bandwidth by not doing it again
    include($cachefilename);
    exit();
}
// Now we read the feed
$rss_feed = file_get_contents($rss_url);
// Now we replace a few things that may cause problems later
$rss_feed = str_replace("<![CDATA[", "", $rss_feed);
$rss_feed = str_replace("]]>", "", $rss_feed);
$rss_feed = str_replace("\n", "", $rss_feed);
// If there is an image node remove it, we aren't going to use
// it anyway and it often contains a <title> and <link>
// that we don't want to match on later.
$rss_feed = preg_replace('#<image>(.*?)</image>#', '', $rss_feed, 1);
// Now we get the nodes that we're interested in
preg_match_all('#<title>(.*?)</title>#', $rss_feed, $title, PREG_SET_ORDER);
preg_match_all('#<link>(.*?)</link>#', $rss_feed, $link, PREG_SET_ORDER);
preg_match_all('#<description>(.*?)</description>#', $rss_feed, $description, PREG_SET_ORDER);
//
// Now that the RSS/XML is parsed.. Let's Make HTML !
//
// Entry 0 is the channel's own title, so if there are not at least
// two titles the feed has no items. It happens sometimes, so let's
// be prepared and do something reasonable
if (count($title) <= 1)
{
    echo "No news at present, please check back later.<br><br>";
}
else
{
    // OK Here we go, this is the fun part
    // We'll do up the top 3 entries from the feed
    // ($counter is the item number; it starts at 1 to skip the channel entry)
    for ($counter = 1; $counter <= 3; $counter++)
    {
        // We do a reality check to make sure there is something we can show
        if (!empty($title[$counter][1]))
        {
            // Then we'll make a good faith effort to make the title
            // valid HTML
            $title[$counter][1] = str_replace("&", "&amp;", $title[$counter][1]);
            $title[$counter][1] = str_replace("'", "&#39;", $title[$counter][1]);
            // The description often has encoded HTML entities in it, and
            // we probably don't want these, so we'll decode them
            $description[$counter][1] = html_entity_decode($description[$counter][1]);
            // Now we make a pretty page bit from the data we retrieved from
            // the RSS feed. Remember the function FormatRow from the
            // beginning of the program ? Here we put it to use.
            $row = FormatRow($title[$counter][1], $description[$counter][1], $link[$counter][1]);
            // And now we'll output the new page bit!
            echo $row;
        }
    }
}
// Finally we'll save a copy of the pretty HTML we just created
// so that we can skip most of the work next time
//$fp = fopen($cachefilename, 'w');
//fwrite($fp, ob_get_contents());
//fclose($fp);
// All Finished!
ob_end_flush(); // Send the output to the browser
?>
Moderator
which bit do you want to shorten?
Member
Just the description, but i think the [$counter] part of it complicates matters
... i'm not entirely sure what purpose it serves though, especially considering this script only uses one of the feeds anyway.
Moderator
Change the FormatRow function to this
Code: php
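The code from this post is missing from the thread as captured, so what follows is only a sketch of the kind of change being described, not Northie's actual code. It assumes the description should have its html tags stripped first (so a cut cannot land inside a tag) and be capped at a made-up default of 150 characters; the extra $maxlen parameter is hypothetical.

```php
<?php
// Sketch only: a FormatRow() variant that shortens the description.
// $maxlen and its default of 150 are assumptions, not from the original script.
function FormatRow($title, $description, $link, $maxlen = 150)
{
    // Drop any markup so the cut cannot land in the middle of a tag
    $description = trim(strip_tags($description));

    if (strlen($description) > $maxlen) {
        $description = substr($description, 0, $maxlen);
        // Back up to the last space so a word is not chopped in half
        $lastSpace = strrpos($description, ' ');
        if ($lastSpace !== false) {
            $description = substr($description, 0, $lastSpace);
        }
        $description .= '...';
    }

    return <<<HTML
<dt class="feed_title"><a href="$link" target="_blank">$title</a></dt>
<dd>$description</dd>
HTML;
}

echo FormatRow('Example title', '<p>' . str_repeat('word ', 60) . '</p>', 'http://example.com/', 40);
```

Because the new parameter is optional, the existing FormatRow(...) call inside the loop would not need to change.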
Member
Wha... Northie... you are brilliant... you just made a cool rss script into a brilliant rss script.
Thanks 
Moderator
Member
Awesomeness
great, thanks... i have applied it.
I tell you what Northie, none of this is a quick fix that's been implemented and forgotten... you have gotten me into the habit of looking up the functions i see and figuring out what they do and how they do it... it's incredibly motivating when you see something so cool happen in so few lines. Much better than those tutorials i was trying to weed through
Moderator
If you can get your head round it, look at
http://www.php.net/manual/en/function.x … struct.php
I've got a good implementation of it somewhere which can take any xml document and doesn't need to know the tag names beforehand - it just spits out an array.
I'll post it back here if/when i find it
EDIT
Not found it yet... but if you have DOMDocument compiled into php then you can use something like this:
Code: php
function xml2arr($xml) {
    $data = array();
    $doc = new DOMDocument();
    $doc->loadXML($xml);
    $titles = $doc->getElementsByTagName("title");
    $i = 0;
    foreach($titles as $node) {
        //echo $node->textContent . "\n";
        $data[$i]['title'] = $node->textContent;
        $i++;
    }
    $links = $doc->getElementsByTagName("link");
    $i = 0;
    foreach($links as $node) {
        //echo $node->textContent . "\n";
        $data[$i]['link'] = $node->textContent;
        $i++;
    }
    $descriptions = $doc->getElementsByTagName("description");
    $i = 0;
    foreach($descriptions as $node) {
        //echo $node->textContent . "\n";
        $data[$i]['description'] = $node->textContent;
        $i++;
    }
    return $data;
}
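One thing to watch with xml2arr() is that it collects every title, link and description tag in the document, including the channel's own (and those of an image block, if the feed has one), so the three lists can drift out of step with each other. A possible variation, sketched here under the hypothetical name rss_items(), walks each item tag separately so the fields stay together. It assumes the DOM extension is available and that the feed is plain RSS 2.0; the sample feed is made up.

```php
<?php
// Sketch: return one array per <item>, so title/link/description stay aligned.
// rss_items() is an illustrative name, not an existing library function.
function rss_items($xml)
{
    $items = array();
    $doc = new DOMDocument();
    if (!@$doc->loadXML($xml)) {
        return $items;              // not well-formed: return an empty list
    }

    foreach ($doc->getElementsByTagName('item') as $item) {
        $row = array('title' => '', 'link' => '', 'description' => '');
        foreach (array_keys($row) as $field) {
            // Take the first matching tag inside this <item>, if there is one
            $nodes = $item->getElementsByTagName($field);
            if ($nodes->length > 0) {
                $row[$field] = trim($nodes->item(0)->textContent);
            }
        }
        $items[] = $row;
    }
    return $items;
}

// Made-up feed: the second item deliberately has no description
$sample = <<<'XML'
<rss version="2.0"><channel>
<title>Channel title</title><link>http://example.com/</link>
<description>Channel description</description>
<item><title>First post</title><link>http://example.com/1</link>
<description>Hello</description></item>
<item><title>Second post</title><link>http://example.com/2</link></item>
</channel></rss>
XML;

print_r(rss_items($sample));
```

An item that lacks one of the three tags simply gets an empty string for it, instead of borrowing the value from the next item.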
Moderator
Found it!!!
Here's my function for getting a news feed from yahoo based on a keyword
Code: php
function GetNews($keyword) {
    $url = "http://news.search.yahoo.com/news/rss?p=".str_replace(" ","+",$keyword)."&ei=UTF-8&fl=0&x=wrt";
    $contents = @file_get_contents($url);
    $xml_doc = utf8_decode($contents);
    $data = array();
    $content = array();
    // Parse the XML document
    $parser = xml_parser_create();
    if(xml_parse_into_struct($parser,$xml_doc,$data_values,$index)) {
        xml_parser_free($parser);
        // Group the text of every tag under its tag name
        for($i=0;$i<=count($data_values)-1;$i++) {
            if(isset($data_values[$i]['value']) && trim($data_values[$i]['value'])) {
                $data[$data_values[$i]['tag']][] = $data_values[$i]['value'];
            }
        }
        $z = 0;
        //specific to yahoo results - the real link is a url variable which yahoo redirects you to. This strips it out. The '*' character is literal and marks where the news url is located in the yahoo url
        for($i=2;$i<count($data['LINK']);$i++) {
            $redirect_parts = explode("*",$data['LINK'][$i]);
            $content[$z]['link'] = urldecode($redirect_parts[1]);
            $content[$z]['title'] = $data['TITLE'][$i];
            $content[$z]['description'] = $data['DESCRIPTION'][$i-1];
            $z++;
        }
    } else {
        // One placeholder entry, with title and link kept together
        $content[] = array(
            'title' => "Sorry, no news for ".$keyword." today!",
            'link' => "#",
        );
    }
    return $content;
}
Member
Lol... going through that link you sent me, they would start off with examples using types of amino acids, wouldn't they...
I am definitely going to have to dedicate some serious time to this. i think this evening i'm going to start from basics again, just to get myself into the right frame of mind with the correct understanding. I still have my "All in One - Php, MySql and Apache" book, so i must start grafting. The missus is going to a ladies thing tonight so it will give me a good start.
Also I think getting seriously into this will help me on the Java path as well.
That script is awesome. So you could set it up for people to search yahoo news from your site, and create pages dedicated to a specific key term and just auto-generate the content, compliments of yahoo?
What is your implementation of it? Either of the above, or did i miss something?
Am i correct in assuming that $link, $title and $description are generic to all rss feeds? I notice the
Code: php
int xml_parse_into_struct ( resource $parser , string $data , array &$values [, array &$index ] )
Doesn't make any reference to any of them, and your script uses them directly? Makes sense if they are a generic thing... guesswork if they aren't.
I haven't seen the generated code though, so i would imagine if there are more values put into the array they are just ignored?
Moderator
mmmm, how best to explain this.
xml is just plain text, formatted to the xml specification - using tags to describe data (tags can also be nested for association, just like in html)
So in my example of the yahoo implementation:
These 3 lines get the raw xml:
$url = "http://news.search.yahoo.com/news/rss?p=".str_replace(" ","+",$keyword)."&ei=UTF-8&fl=0&x=wrt";
$contents = @file_get_contents($url);
$xml_doc = utf8_decode($contents);
the variable $xml_doc is now a string of xml formatted text.
This creates the parser (just a handle, but it's needed by the next part):
$parser = xml_parser_create();
Now this is where the clever stuff happens:
xml_parse_into_struct($parser,$xml_doc,$data_values,$index);
$parser is the thing from above
$xml_doc is the xml string
$data_values and $index are passed by reference (that's what the '&' means in the documentation) - so they are now available after this function has executed and contain some data. But it's not in a structure that represents the original document so we mash it up a little.
[skip a few lines to $z = 0;]
The variable $data is now an associative array. The keys are the xml tag names (upper-cased by the parser) and each value is a list of the text found inside every tag of that name, in document order. Nesting is flattened rather than kept.
We can now loop over this array to pull back the bits we want.
--------------------------------------------------------------
TIP:
use the function print_r() to show the contents of an array.
--------------------------------------------------------------
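To make the "mash it up a little" step concrete, here is a small self-contained sketch, run against a made-up document rather than a real feed. It shows that xml_parse_into_struct() hands back a flat list of tag records, and that grouping them by tag name gives one list of values per tag. The sample xml and the variable names other than those used above are assumptions for illustration.

```php
<?php
// Made-up sample document, standing in for a downloaded feed
$xml_doc = '<rss><channel><title>Demo</title>'
         . '<item><title>One</title><link>http://example.com/1</link></item>'
         . '</channel></rss>';

$parser = xml_parser_create();
$ok = xml_parse_into_struct($parser, $xml_doc, $data_values, $index);

// Group the text of every tag under its tag name. Names come back
// upper-cased because case folding is on by default in the parser.
$data = array();
if ($ok) {
    foreach ($data_values as $record) {
        if (isset($record['value']) && trim($record['value']) !== '') {
            $data[$record['tag']][] = $record['value'];
        }
    }
}

// Show the grouped array; print_r($data_values) shows the raw records
print_r($data);
```

Running print_r() on $data_values as well is a quick way to see the open, complete and close records the parser produces before they are grouped.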
RSS is a standard way of laying out an xml document; so every rss feed has a channel with a title, link and description, and the items inside it use those same tag names (as well as some other optional stuff)
--------------------------------------------------------------
On my site, articledocs.com, i have used this implementation on many of the pages. For example:
http://articledocs.com/articles/best-we … design.htm
[there's some .htaccess on there - but it is a real php page]
Member
Hectic... that all seems pretty straightforward. Does make me wonder though, why do people bother with the java rss parsers?
I am keen to come up with my own variation of all this but i think the idea is already stale (scrapers). Surely it can't be all that complex a task to create though: search google or yahoo for sites with words matching your choice of words, stop at twenty and display the title, url and meta description... Then again, sites are not generally of the same standard - at least not like rss as you explained. Oh well.
Thanks a ton Northie, you really have cleared a lot of things up for me. i guess this is where logic meets common sense and everything looks simplified.