reading a list of feeds
March 6th, 2005 Posted in Bash
Tonight I'm going to start the script off by reading a list of feeds and fetching each one for parsing.
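#!/bin/bash
BASEDIR="/mnt/usb0/mp3/podCast"
FEEDS="${BASEDIR}/feeds.lst"

# Outer loop: one feed URL per line of the filtered list.
while read -r URL; do
    # Inner loop: print the contents of each <link> element in the feed.
    while read -r LINE; do
        echo "$LINE" | sed -n 's/.*<link>\([^<]*\)<\/link>.*/\1/p'
    done < <(wget -q -O - "$URL")
done < <(grep -v -e '^[;#]' -e '^$' "$FEEDS")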
We're using grep to filter out lines starting with ; and #, as well as blank lines. We could get fancy and validate the URL, but this will suffice for now.
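To see the filter on its own, here is a minimal sketch that runs the same grep over a small sample list. The function name list_feeds, the sample file, and the example.com URLs are all made up for illustration. The second grep is one possible way to do the "fancy" URL check, assuming we only care that a line looks like an http or https URL; it does not verify that the feed actually exists.

```shell
#!/bin/bash
# Hypothetical helper (not part of the script above): print only the
# usable feed URLs from a list file.
list_feeds() {
    # First grep: drop comment lines (starting with ; or #) and blank lines.
    # Second grep: an assumed, very loose check that keeps only lines
    # that look like http:// or https:// URLs with no whitespace.
    grep -v -e '^[;#]' -e '^$' "$1" | grep -E '^https?://[^[:space:]]+$'
}

# Made-up sample list, written to a temporary file for the demonstration.
SAMPLE=$(mktemp)
cat > "$SAMPLE" <<'EOF'
# my podcasts
; disabled for now
http://example.com/show/rss.xml

not-a-url
https://example.org/feed
EOF

list_feeds "$SAMPLE"
rm -f "$SAMPLE"
```

Run as-is, this should print only the two URLs: the comment lines, the blank line, and the not-a-url line are all dropped.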
If all we really wanted was a list of mp3 URLs, we could pipe wget directly through the sed command, but I have plans to parse out more than just the mp3. To keep our files organized and minimize network traffic, I plan to also parse out the feed and show titles and the pubdates. We'll delve into the parser more tomorrow. For now, good night, and happy bashing.