SimpleXML PHP tutorial: learn to parse any XML files and RSS feeds

Hello. In this lesson you will learn how to use PHP to parse XML feeds and XML API data from any of your favorite websites, or from websites that your clients request dynamic data feeds from. We will be tapping into the SimpleXML extension of PHP; with just one of its functions you can parse RSS and API data from all of the popular websites that syndicate their information.

So this is what we're going to do in this video lesson. I'm going to go to YouTube, then to developphp.com, and then to TED, and access their data feeds. I'm going to snoop out where they link to their RSS data feeds, get the URLs to those feeds, create an XML object out of each URL, and then parse it just like a regular XML object. That's what we're going to show you how to do. It's no problem, it's very easy: it's like two lines of code, maybe three. Don't worry.

The first thing you want to do is create a new blank PHP file. You can name it whatever you want; mine is named feed parse example, with a .php extension. Within it, let's create a new PHP scripting block: open up the PHP tag and then close it. On the first line let's create a variable called $html, because that's going to be our HTML output variable. At the end of the script we're just going to echo that $html variable, so let's go ahead and do that now: echo $html. That's going to be the last line in our little example. Right now $html is empty, so this would echo emptiness to the page. What we're going to do inside of a loop is build up this $html variable.

Now let's create a variable called $url, because that variable is going to hold the URL of the data feed, or syndicated feed, that whatever popular website happens to have running. Now, I said we
were going to target YouTube, so let's go to YouTube and try to find their data feeds. I'm going to go to the bottom and click Developers. That link should lead me to their data APIs, which are right here. I'm going to click the PHP link at the bottom, and here's their developer guide for PHP. Really, I don't have to tap into their API. You can if you want to, but you don't have to; you can just tap into the feed in a raw sort of way if you have the URL.

So I'm going to go down until I find, let's see, "Searching for videos", "Videos uploaded by a specific user". That's what I want. I'm going to target my specific channel; you can target your channel or any channel that you like. You can look at all of the information they have here. All these different URLs point to different sorts of data feeds, and in many instances, where it says feed identifier, you just supply the identifier that you want and that's the kind of information you get access to. But what I want is videos uploaded by a specific user, so I'm going to grab this URL right there. That's the URL I'm going to work with, and I'm going to put it right there in $url. I'm going to change this from username, because that's just a generic placeholder, to flashbuilding, my channel name. So now we have the URL to the data feed that we want from YouTube in place and ready to go.

Now we're going to create our XML object. Let's use $xml as the variable name for that XML object. The SimpleXML function that we're going to tap into is called simplexml_load_file. You open and close the parentheses, put in your semicolon, and then add the $url variable as the parameter. Once you load this URL into the simplexml_load_file function you get an XML object from it, and now we can work with that within a loop. So let's
go down one line and type in for, because this is going to be a for loop. Open and close the parentheses, open your curly brace, go down a couple of lines, and close off your curly brace. Within the parentheses of a for loop you have to supply three parameters. I'm using a for loop because I want a ceiling on the amount of data that I get back. Maybe I only want ten items, maybe I only want five items to display from that data feed; maybe it has a hundred items in it and I don't want that many. Plus, the loop helps us keep everything lean: I don't need a whole lot of code to render out many different items, because they all come rendered out through my for loop.

So the for loop, I said, gets three parameters. The first one is the variable: we're going to name it $i and set it equal to zero to start with, and then you put a semicolon. The next parameter is the condition logic, so we say $i is less than ten. This is what it means: first you created a variable called $i and gave it a starting value of zero; then your condition says that as long as $i is less than ten, this loop is going to run and process the code within it; and the way $i changes from zero up toward ten is this $i++, which increments the number. So you'll actually go from zero to nine, which gives you ten items. Within this for loop is where we're going to access all of the API data, or the feed data, and pack it into the $html variable.

Before we can put some code in there, we have to know the structure of this data feed. So go to that data feed within your browser: just put the URL into the address bar and it'll show you the feed. We can see the feed right here. Whether you'll see the actual feed data or not depends on what browser you're in. So you see the feed data. Now what I'm going to
do is go down to the bottom, because I find that's the quickest way to get to what I want to see. You can see the closing tag is feed, and since it's my YouTube channel that we're targeting, each video is represented by an entry. So I can target feed and then entry, and that will give me an array of all the entries, or something like an array, even though it's an XML data structure. These are really the first things I need to see to be able to tap into what's going on here. Then I can check out the rest of the nodes. There's link, see that link node? I can target that node. And there's probably a title node for the title of the videos, so let's see if that works.

Let's go back into the code and place in a new variable, $title. That's going to be equal to our XML object, and then we go into feed, and then into entry from feed, so we're just accessing the nodes of the XML there. Then we place in brackets, and within the brackets we put our $i variable. That $i changes each time through the loop: on the first pass $i is 0, on the second pass $i becomes 1, all the way until it reaches 9, because that's the ceiling we have. So we get 0 through 9, which is the 10 items we want. After we access entry we go into the title node, so let me just type in title. That should give us 10 video titles from my YouTube channel: my latest 10 videos, just the titles of them.

Now we have to make sure we put that into the $html variable. So let's snatch up that $html variable, put it right there, and then put dot equals (.=) to make sure we're appending to the variable and not overwriting it each time. We're going to output that title within the HTML. Actually, we can output some actual HTML there, so we'll put the title within double quotes, and then you can wrap that in h3 or h2, whatever you want, or I'll just
wrap it in paragraph tags; that way each one is separated down the page. Now let's see if we get any success with that. If we do, we can build upon it and get more information out of that data feed.

OK, here's my example page, and as you can see, I don't get squat. So I've got to change something. Let's see if we do it without accessing the feed node and just go straight into the entries. Let's try that. OK, now refresh, and there we go. You can see I have my latest 10 videos; here are their titles. So that means I've successfully tapped into the YouTube data API, or the XML data feed, for my channel. Now I can go about accessing the description of the videos, the ID, and all the other information that I want to grab out of it.

So here's the raw source view of that feed from YouTube again. I'm going to highlight a bunch of this and take it into my text editor to see what's going on with it. I'll create a new file, it could be any kind of file, I'm not even going to save it, and paste all of that in. I want to see how these nodes are set up. You can see there's the feed node closing and the entry node closing right there, and we have a lot of other nodes. What you want to do is go in and separate these all out so you can see the individual nodes.

Just so things are really clear, I broke it down to the individual nodes that I want to access and removed all the content from them, because that's not important; what is important is the name of each node. What I want to get is the id, title, content, and author name. What I have so far is the title, so let's get the ID now. What we have to do is pop in id here and id right here. Then we'll get the description. In this case YouTube calls the description content, so the description for the video is content. Let's just call it content.
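The loop built up to this point can be sketched roughly as follows. This is a minimal sketch, not the script from the video: the inline sample XML is a made-up stand-in for the YouTube feed (a live feed would be read with simplexml_load_file($url) instead), and the htmlspecialchars() escaping and the min()/count() guard are additions. One detail it illustrates: the object returned by the loader already represents the root feed element, which is why going straight to entry works while feed->entry produced nothing.

```php
<?php
// Made-up stand-in for an Atom-style feed (assumption: a real feed would be
// fetched with simplexml_load_file($url) rather than parsed from a string).
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://example.com/feeds/videos/abc123</id>
    <title>First video</title>
    <content>Description of the first video</content>
  </entry>
  <entry>
    <id>http://example.com/feeds/videos/def456</id>
    <title>Second video</title>
    <content>Description of the second video</content>
  </entry>
</feed>
XML;

// $xml IS the <feed> element, so entries are reached as $xml->entry.
$xml = simplexml_load_string($sample);

$html = "";
// Ceiling of 10, lowered when the feed holds fewer entries.
$limit = min(10, count($xml->entry));
for ($i = 0; $i < $limit; $i++) {
    $entry   = $xml->entry[$i];
    // Cast each node to a string, then escape it for safe HTML output.
    $title   = htmlspecialchars((string) $entry->title);
    $content = htmlspecialchars((string) $entry->content);
    $id      = htmlspecialchars((string) $entry->id);
    $html   .= "<p>$title<br />$content<br />$id</p>";
}
echo $html;
```

With only two entries in the sample, the loop runs twice instead of ten times, so no empty paragraphs are appended.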
And the last thing we want is the author name. We have to go one more node in on that one to get to the author name. So let's grab this, and actually let's put that one up top. Put author, then go one more node in, to the name element. We'll call this name, or actually let's just call it $author, that makes more sense. All right, that's all the data I want for my little loop.

What I'm going to do here is change these paragraphs to div elements, because they're going to have more lines in them now. The title I'm going to wrap with maybe an h3 element, and I want to add the content under the h3 element, so I can put that right there. The ID I'm going to put right under the content, so I'll put a break tag right there; the ID will be under the content, which is the description for each respective video. And the author we can put in there as well: we'll put that right in front of the ID, with a hyphen or something like that between them. Now let's see how that renders out. Actually, let me put an hr element in, so that between each pass of the loop there'll be an hr element that renders.

We'll refresh our test page, and there we go. This is the first video, the latest one that I put up. There's the title in the first line: each first line, in the h3 element, holds the title, and you can see the description comes after the title. Then we have the author name right there and the ID.

Now, if you just wanted the video ID, this link might not be good for you, because this whole link actually points to a data feed. But if you just wanted the ID, you can easily do that in PHP. You would just put the string replace function to work on it. So let's put $id equal to str_replace, open and close the parentheses, and put a semicolon. Now let's go back in the browser and take a look. Let's get that link; these are the characters that we want to delete out
of that. Now watch what I'm going to do here. My first parameter to the str_replace function is what I want replaced, which is this string. The second parameter is what I want to replace that string with, in this case nothing, because I just want to delete that part of the string. And then I can put the $id variable there, because that's the data I want to affect. So str_replace takes three parameters: what you want to replace in a string, what you want to replace it with, and the string that you're doing the replacing on.

Now let's see what that gives us. We should see just a video ID. That way you could render the video to the page if you wanted to: if you wanted to render the video using YouTube's embed code, you can get the ID this way very easily. So let's refresh and see if this whole ID link changes to just the video ID. And there we go, we have just the video ID.

Now check this out. We can go to YouTube and grab the embed code for any video. Where is that? Share... I think if we go to Embed, see that embed code right there? I'm going to grab that, go into my HTML, and pop that code in before my hr tag. You'll see we would have to escape any double quotes, so I'm going to change them to single quotes so I don't have to put any backslashes in my code. Just change all these double quotes to single quotes, and there we go, now the video will render on the page. Oh, actually, you've got to change this to the video ID, which was right here: $id. Take that and put it right there; that way it's dynamic for all of them on the page.

All right, we'll refresh the page and see what we get. You can see that underneath each title and description the video is actually rendering, and it's the proper video for that title and description. Pretty cool, huh? So that's how you can get a
lot of automated data flow on your websites, or your clients' websites, without having to do very much. Look at that, the script is tiny. And since we have most of the code structure in place, all we've got to do is change a few things here and there, because most XML and RSS data structures for various websites are set up differently. You're not going to find the same node names and node structure on YouTube as you would on TED or developphp.com. Some of them conform to the same node naming, but you won't find the same node names and structure on every feed that you come across. That's why you've got to dig into it the way I did, and break it down the way I broke down the entries from YouTube to get at all the little nodes that I wanted.

Now we'll go to developphp.com and see if we can access one of its feeds. We'll click the little RSS syndication button down here. We can use any feed we want; let's get the library syndication and look at the source code for it. You can see mine is a little bit neater, and it's all dynamic, so all my latest material will be in here. We don't even have to grab any of this code; we can see what's going on clearly. We have the channel, and we have items, so I think we can just target the item.

So let's get rid of name, so we don't have any of that there, and just work on title. We're not going to be replacing the ID; we don't have that. Let's remove this iframe that rendered the video from YouTube; we don't need that in this one. We just want the title at first, so that's all I'm going to render. Let's get rid of this content as well. Instead of entry I'm going to put in item, because you can see my XML has no entry tag; I have item tags. Within each item are all the various little nodes for that item, and you can see I have title as well. So
I can target category, title, link, description, the guid (with isPermaLink), pubDate, and anything else that's within that item element. Let's just see if titles render first. I should get ten titles of the latest library material added there. Oh, I've got to make sure I have the correct URL as well; I'm still on the YouTube URL. So I can go here, copy the link address, and put it right into place to replace the YouTube data feed link. You can see mine says .php. What that means is that mine is a dynamically rendered feed: it uses PHP to access my database, and that's how my RSS feed is rendered, from dynamic MySQL data. But it pumps it all out in XML structure, so anybody can tap into it using these sorts of methods. Let me save that, and now we'll give it a test.

OK, back at that same page, let's refresh. You can see I get them. Now let's see what else I can access. Let's grab the description, and I'll get the pubDate too. So let's copy this line and put in two more. This one we're going to call $pubDate; take that and put it over here. And this one we're going to call $description; take that and put it over here. Now we can add those. Let's add the description right under the h3, and then maybe put a break tag in there and put $pubDate as the last line. Then we'll have an hr right there. Now let's press refresh, and there we go: now we get the title, the description, and the date.

All right, so that shows you how you can go around attacking different feeds no matter what their structure is like. Well, we're not attacking them; you can just strip all the data you want right out of them and display it any way you like. And now, to show you how easy and versatile all this is, we're going to go to TED and chew up one of their feeds.

OK, so here I am at TED.com. This is a site that I enjoy because they have very intellectually stimulating videos a lot of the time. Sometimes it's crap,
but most of the time it's very intellectually stimulating. It's all videos, like lectures, really cool stuff, and they have a YouTube channel as well. Let's go down to the bottom here, and you can see the little RSS icon right there. So let's get TED Talks via RSS feed. Click that, and it takes you to what looks like FeedBurner, but that doesn't matter. This is the feed right there; you can see it all rendering. So let's view source: right-click, View Page Source, and you can see this is in fact XML-structured material. We can go down to the bottom again and see what we have here. We have channel, which is like the developphp.com structure, where I have channel. What else do they have? They have link, title... let's see if they use item. Yep, I see item right there, so they're using item as well.

So in the code, let's just start with title like we did with the others, and then we'll know we can access everything else. Let's remove these right there so just the title will render. And we have to make sure we have the right URL: just grab this URL right up top, at FeedBurner, and put it right there in the code. This should get us the 10 latest video titles from TED.com. Once we get the title rendering, we can delve into the rest of the XML structure and get at the other nodes.

So here we are at our example page once again. Let's refresh, and there we have it. You can see that the structure of the TED feed is a lot like the developphp.com feed, because I didn't have to change any of this: they have a channel node, they have item nodes, and they use title as well. So now you know that you can dig into the rest of this feed and get to the link, the pubDate, the category, whatever you want. Let's go ahead and access the category, just for example's sake. Right here, let's type in category, name the variable $category, and we're
going to render that variable within our HTML. Let's put it in front of the h3 tag, and we'll wrap the category within an h2 element. Let's see what that gives us. Refresh, and you can see they're all higher education videos. Let's see what else we can access about each video, or each entry. Let's get the description: we can copy that node name, go back into our code, copy that title line, paste it right there, and swap in description. So we have $description now, and we can put it underneath this h3 element. Now let's see what we get. Refresh our page, and now we have a nice description for each video.

Now, I'm not going to waste your time or mine by taking this any further. You know by now that we can access any of the nodes we want within any XML structure, from any feed, from any site. So now you know how to go to any popular site you want, and if they have APIs, data feeds, or RSS syndication, you can tap into those XML structures, rip out all the data you want, and custom-design the data presentation with HTML and CSS just like you would any other. And we use a nice little for loop to give it a ceiling, so we get out just as much as we want and no more.

OK, I hope you've enjoyed this video lesson, and I'll talk to you very soon.
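For reference, the RSS half of the lesson (channel, item, title, description, pubDate) and the str_replace ID trick can be sketched together like this. It is a sketch under stated assumptions rather than the script from the video: renderFeedItems() is a hypothetical helper name, the sample XML and the example.com ID prefix are invented, the escaping with htmlspecialchars() is an addition, and a live feed would be loaded with simplexml_load_file($url).

```php
<?php
// Hypothetical helper (not part of SimpleXML): renders at most $limit items
// from an RSS-shaped document whose items live under channel->item.
function renderFeedItems(SimpleXMLElement $xml, int $limit = 10): string
{
    $html = "";
    // Stop early when the feed holds fewer items than the ceiling.
    $count = min($limit, count($xml->channel->item));
    for ($i = 0; $i < $count; $i++) {
        $item        = $xml->channel->item[$i];
        $title       = htmlspecialchars((string) $item->title);
        $description = htmlspecialchars((string) $item->description);
        $pubDate     = htmlspecialchars((string) $item->pubDate);
        $html .= "<div><h3>$title</h3>$description<br />$pubDate</div><hr />";
    }
    return $html;
}

// Invented sample data standing in for a real RSS feed.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Sample feed</title>
    <item>
      <title>First item</title>
      <description>About the first item</description>
      <pubDate>Thu, 23 Feb 2017 11:12:56 +0000</pubDate>
    </item>
    <item>
      <title>Second item</title>
      <description>About the second item</description>
      <pubDate>Wed, 22 Feb 2017 09:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Third item</title>
      <description>About the third item</description>
      <pubDate>Tue, 21 Feb 2017 08:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($sample);
$html = renderFeedItems($xml, 2);   // ceiling of 2: the third item is skipped
echo $html;

// str_replace(search, replacement, subject): delete a known prefix from an ID.
// The prefix below is hypothetical; use whatever your feed's IDs start with.
$feedId  = 'http://example.com/feeds/videos/abc123';
$videoId = str_replace('http://example.com/feeds/videos/', '', $feedId);
```

Taking the smaller of the ceiling and the real item count means asking for more items than the feed contains does not append empty blocks and stray hr lines.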

44 Comments

  • Rich Williams

    would you ever save an xml file to a dbase or always spit out to a page direct from the simplexml parser?

  • D P

    Thank you so much! Awesome tutorial Adam! I'm very new to php, but I was able to dynamically populate a website with XML using this tutorial. My page loads picks out only upcoming events from an XML file of events. Using
    foreach($xml->event as $event){
    if($event->date > $Yesterday) {
    $i++;
    }
    Would you be so kind as to help me go further? Given that one of my child nodes is $xml->event->group, how would I select only the 1st 8 of the XML file that belong to a particular group like let's say "meetings" and not show other groups like say "workout" or "dinner" that are in the same XML structure? Then of that group further target for each event with only the
    if($xml->event->date > $Yesterday){
    (Display only upcoming meetings)
    }
    ?? While I'm able to filter only the events that have an upcoming date, OR events that belong to a group of "meetings", I can't figure out how to filter only the meetings in the upcoming dates. Thanks in advance for your help.

  • 2kOLF

    PERFECT!!! I can't tell you how long it took to find this, a straight forward, simple, easy to understand video! Thank you so much!!! I have one question. How can I change the formatting of the time stamp given by a certain feed?

    The feeds I'm parsing show the times as <created>2017-07-20T02:17:50Z</created> and the other feed shows it as <pubDate>Thu, 23 Feb 2017 11:12:56 +0000</pubDate>

    Ultimately what I'd like to do is, not show the hour, minutes and seconds.

    I've searched for the answer, but again no one seems to be able to give a perfect explanation like you.

    Thanks again!!

  • J. Richard Kirkham B.Sc.

    Mahalo for this brother. I watched this when I was first getting into php. I couldn't at that time get my feed reader to work. Now I get it and realize how helpful and clear your video is. If I may share: I like the <hr> tag, but if you ask for more videos than exist you get blank hr lines. So I added this

    if($title != ''){

    $html .= "<a href='$link' target='_blank'><h3>$title</h3></a>";
    $html .= "$description";
    $html .= "<br />$pubDate";

    }

    Basic stuff I know and most of you probably already know this.

  • aussie wilmar

    what about for these? how can i do?

    <guid isPermaLink="true">
    <![CDATA[
    ]]>

    <link></link>

    <content:encoded>
    <![CDATA[
    EUR/USD: Bullish: Break of 1.1000 could lead to acceleration higher.

    EUR has tested our patience but finally started to move to hit an overnight high of 1.0987. While the 1.1000 target that was first indicated last Wednesday is not met, the strong surge higher yesterday bodes well for our bullish…<br /><img src="https://www.forexfactory.com/attachment.php/2295826/alloy-main.jpg" alt="" />
    ]]>

  • Ronald Lee

    using this method I had a problem to get author's value :$author = $xml->channel->item[$i]->dc:creator; how can I handle this element of <dc:creator>? thanks

  • Neil Robinson

    Very helpful, cheers 🙂
    I found that using this method in Joomla, the items would not display on new lines, so I added a plugin to Joomla, DirectPHP, and that sorted the problem out. Don't forget to enable the plugin after installing.

  • Dado

    This is helpful, but I have another question – How can I read my own (custom) feed with my own feed reader using options drop down menu? How can I send you the code that I wrote, you could check where the error is? Thanx

  • Alexandru Hertug

    wow .. thank you! was perfect for me. I just had to copy and paste all you did. keep going the good work my friend

  • Carlos Oceguera

    This still works, I've used it many times… the only drawback is that it doesn't explain how to extract images from description tag. Can any one help explain how to do it?

  • Williams Bonnel

    Can someone can help to use this API to build a html page on my website who get data from the api ? thanks if someone can contact me for that
    http://www.residentadvisor.net/api/dj.asmx?op=getcharts

  • The Christian Business Network Forum

    I'm a dual certified teacher and by trade an in home computer tutor here in Honolulu Hawaii. You assumed nothing and gave great, simple explanations. I wouldn't be surprised if you were a trained teacher as well.

  • The Baylis Code

    Is there a way to search through an xml file with php similar to searching through a mySQL database and displaying the results?

  • Bob Ray

    what am I doing wrong, error msg

    <?php 
    $html ="";
    $url= "http://www.npr.org/rss/rss.php?id=1001";
    $xml = simplexml_ load _file ($url) ;
    for ($i = 0; $i < 10; $i++) {
    $title  = $xml->channel->item[$i]->title;

    $html .=<div><h3>$title</h3><>/div><hr/>";

    echo $html; 
    ?>

  • Frank Gonzalez

    I'm trying to grab digits from an rss feed then compare that against a text field to see if the numbers match up. Any advice?
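Several of the comments above ask about prefixed elements such as dc:creator and content:encoded, CDATA sections, pulling an image out of a description, and trimming the time off a feed date. The following is one possible sketch, not an answer from the video's author. The sample XML is invented; the two date strings are the ones quoted in the comments; the regular expression is a deliberately simple assumption that only handles a double-quoted src attribute.

```php
<?php
// Invented sample item using the dc: and content: prefixes from the comments.
$sample = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <item>
      <title>Sample item</title>
      <pubDate>Thu, 23 Feb 2017 11:12:56 +0000</pubDate>
      <dc:creator>Jane Doe</dc:creator>
      <content:encoded><![CDATA[<p>Full text</p><img src="https://example.com/a.jpg" alt="" />]]></content:encoded>
    </item>
  </channel>
</rss>
XML;

$xml  = simplexml_load_string($sample);
$item = $xml->channel->item[0];

// Prefixed elements are reached through children(); passing true as the
// second argument means the first argument is a prefix, not a namespace URI.
$author = (string) $item->children('dc', true)->creator;

// Casting an element to a string returns the text inside its CDATA section.
$body = (string) $item->children('content', true)->encoded;

// Simple pattern for the first double-quoted src attribute of an img tag.
$image = preg_match('/<img[^>]+src="([^"]+)"/i', $body, $m) ? $m[1] : '';

// DateTime parses both date styles quoted in the comments; format() then
// keeps only the calendar date and drops hours, minutes, and seconds.
$rssDay = (new DateTime((string) $item->pubDate))->format('Y-m-d');
$isoDay = (new DateTime('2017-07-20T02:17:50Z'))->format('Y-m-d');

echo "$author | $image | $rssDay | $isoDay\n";
```

For HTML that is less regular than this sample, parsing the description with DOMDocument would be more robust than a regular expression.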
