Thursday, February 11, 2010

Incorporating XML Content
Into Your Website

Here's an article that
I find interesting:

Incorporating XML Content
Into Your Web site (ASP)


The first thing I notice about
this article is that it is for
ASP. I assume he means
Active Server Pages, a
Microsoft technology.

The author puts ASP in
the title of his article.

The next thing I notice is that
you need a URL for your feed.
This makes sense. No URL, no
feed.

Ah! Here's another key element.
A way to format the XML.

I had not thought about this. I
had assumed that you need to parse
the XML and then turn it into
well-formatted HTML.
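
Here's a minimal sketch of that
parse-it-yourself approach. It is
written in Python rather than ASP
or PHP, and it is not from the
article. The sample feed and the
function name feed_to_html are made
up for illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# A tiny made-up RSS 2.0 feed, used only for illustration.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>A made-up feed</description>
    <item>
      <title>First story</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second &amp; last story</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def feed_to_html(feed_xml):
    """Parse RSS 2.0 text and return an HTML list of linked titles."""
    root = ET.fromstring(feed_xml)
    lines = ["<ul>"]
    for item in root.findall("./channel/item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        # Escape the text so characters like & stay valid in HTML.
        lines.append(
            '  <li><a href="%s">%s</a></li>'
            % (escape(link, quote=True), escape(title))
        )
    lines.append("</ul>")
    return "\n".join(lines)


print(feed_to_html(SAMPLE_FEED))
```

The result is an HTML list with one
linked title per feed item.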

I'm being shown another approach
here. It appears that the XML
is being left as is and another
language, XSL, is being
used to format the XML.

Here's a Wikipedia article on
XSL:

Extensible Stylesheet Language

OK. I'm starting to understand
now. A basic application of XSL
is to transform XML into
HTML.

The implications of this are huge.
This means you can load XML into
an HTML div element and
you don't have to parse it yourself.
You just simply load it and let
XSL interpret it for the browser.

I'm getting way ahead of myself
here. However, I think this is
what is being said.

Towards the end of the article,
the author says that you can
use either PHP or ASP
to parse the XML.

OK. Let me think this through.

Both PHP and ASP
are server-side languages that
are embedded in the web server
in some fashion.

So now I understand. There's
a module of some kind that does
the XML parsing for you.

However it is XSL that
sets up the rules for how the
transformation from XML
to HTML takes place.

This is making total sense.

I'm learning a new language here.
I'm learning about transformations.

Specifically, I'm learning about
XML Transformations.

In life, one thing leads to another.

Now that I know the central issue
when embedding an XML feed into your
webpage is an XML transformation,
I'm much better prepared to cope.

I now know to search on XML
Transformations
if I wish to
learn more.

An assumption I made at the beginning
of this post turns out to be
incorrect. I had thought that it
was the browser that does the
XML transformation.

Obviously not true for many different
reasons.

One reason it is not true is that
both PHP and ASP are
server-side. Therefore, it must
be the web server that transforms
XML into HTML.

Another reason is common sense. It
would be a horrible horrible browser
compatibility issue if every browser
did its own XML transformation.

It makes much more sense to do the
XML transformation in one place
and that one place would logically be
the web server.

A final reason for doing the transformation
on the web server is adoption. If
XML transformations took place in the
browser, you would have to get everyone
who makes a browser to go along.

In other words, every web browser would
eventually have to adopt XML transformations
along the lines of some standard.

Not likely.

So, for many many different reasons, XML
transformations are a server-side
phenomenon, not a client-side one.

OK. I'm wrong again.

Turns out that an XML transformation can
be either client or server. Turns out
that you can also use JavaScript to
accomplish an XML transformation.

However, I think that it is probably the
case that most transformations are done
on the server side.

One advantage of doing an XML transformation
on the client side is that you off-load the
processing on to the client. In other words,
the client does the heavy lifting, thus saving
your server slow-downs and wasted machine
cycles.

A severe disadvantage to client-side XML
transformations is that the client (the
web browser) has to have JavaScript enabled.

As you can see, I know very little about this.
However, I'm fast learning about XML transformations.

Here's what I think I've learned so far.
I'm a newbie, so all of this is subject
to change:

  1. ASP can be used to do an
    XML transformation on the
    server-side
  2. PHP can be used to do an
    XML transformation on the
    server-side
  3. JavaScript can be used to do
    an XML transformation on the
    client-side

If this means nothing to you,
think of it this way. Either
the web browser or the
web server does the
transformation which turns your
web feed into a web page.

Ed Abbott

Finding CNN's Feed

I'm new to RSS. I just learned
how to find CNN's feed by going
to their homepage.

Here are the general steps:

  1. Go to www.cnn.com
  2. View the source HTML
  3. Look for the link tag
  4. Find the link tags that
    say the following:

type="application/rss+xml"

Ok. That's pretty general. Let's
get more specific.

It turns out that CNN's home
page has link tags representing
two RSS feeds. Here are the two
feeds as described by CNN:

  1. CNN - Top Stories [RSS]
  2. CNN - Recent Stories [RSS]

So, in other words, CNN has two
feeds. One is for top stories
and the other is for recent stories.

Must be that one feed has stories
that are currently dominating the
news and the other has stories
that were recently dominating the
news. Something like that.

Here's specifically how I found
out about these two feeds. I
took the following steps:

  1. Go to www.cnn.com
  2. Right click
  3. A menu appears
  4. Choose View Source
  5. Type Control-F
  6. A search box appears
  7. Search for link
  8. Hit F-3 (search again) until
    you find the first link tag that
    says rel="alternate".
  9. You just found the first feed
  10. Hit F-3 (search again) until
    you find the second link tag that
    says rel="alternate".
  11. You just found the second
    feed.

Here's what an empty (and rather
useless) link tag looks like:

<link>

An empty link tag is not at all
realistic. Link tags always
have attributes inside of them.

Here's what the link tags we
are searching for look like when
we include only the first
attribute:

<link rel="alternate"

The above is not really a link tag.
It is the beginning fragment of a
link tag. Link tags always end with
a > character (greater-than
character).

One more thing. Both of CNN's feeds
are RSS 2.0 feeds. This attribute
inside the link tag tells us this is true:

type="application/rss+xml"

If CNN were to offer an Atom
feed as well, the type attribute
would look like this:

type="application/atom+xml"

Note that CNN offers RSS feeds
only. At the time of this
writing, CNN does not offer an
Atom feed.

Oh, I almost forgot. There's one
more thing to look for inside
of the link tag. That's
the feed address.

Here's the attribute you look
for when you want to find where
the feed is kept:

href="feed.xml"

Of course, the name of the feed
is not likely to be feed.xml.

However, I think you get the idea.
The feed is an XML document
located at some URL.
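
The search steps above can also be
done by a program. Here's a minimal
sketch in Python. The page source is
made up (it is not CNN's actual
HTML), and the class name
FeedLinkFinder is my own invention.

```python
from html.parser import HTMLParser

FEED_TYPES = ("application/rss+xml", "application/atom+xml")


class FeedLinkFinder(HTMLParser):
    """Collect (type, href) pairs from <link rel="alternate"> tags."""

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower()
        feed_type = (attributes.get("type") or "").lower()
        # Keep only link tags that advertise a feed.
        if rel == "alternate" and feed_type in FEED_TYPES:
            self.feeds.append((feed_type, attributes.get("href")))


# Made-up page source; a real home page would be much longer.
SAMPLE_PAGE = """<html><head>
<link rel="stylesheet" type="text/css" href="style.css">
<link rel="alternate" type="application/rss+xml" title="Top Stories" href="top.xml">
<link rel="alternate" type="application/rss+xml" title="Recent Stories" href="recent.xml">
</head><body></body></html>"""

finder = FeedLinkFinder()
finder.feed(SAMPLE_PAGE)
for feed_type, href in finder.feeds:
    print(feed_type, href)
```

The stylesheet link is skipped
because its rel attribute is not
"alternate". Only the two feed
links are reported.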

Enough for now.

Ed Abbott

Atom Format Versus Atom Protocol

In my last post, I talked about
the Atom Standard:

The Atom Standard

A little review. Within the
Atom Standard, there are two
main branches, the format versus
the protocol.

At least, that's how I see it.
I'm new to this.

Here's the two branches:

  1. Atom Syndication Format
  2. Atom Publishing Protocol
    (AtomPub or APP)

I think that the choice of words
that makes one branch a format
and the other branch a protocol
is an interesting choice.

Choices matter. The choice of
protocol versus format
matters.

Here's a Wikipedia article that refers
to the Atom Syndication Format
as the Atom Format:

Atom (standard)

In explaining the Atom Format, the
article basically says that it is an
alternative to the RSS 2.0 Format.
In other words, the Atom Format has
emerged as a possible drop-in replacement
for the RSS 2.0 Format.

I'm using language loosely here. When I
say drop-in replacement, I don't
mean that the two formats are precise
equivalents. I mean to say that they are
near equivalents.
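
To show what I mean by near
equivalents, here's a minimal sketch
in Python that reads the same made-up
post from an RSS 2.0 feed and from an
Atom feed. Both samples are my own
and are stripped down (a real Atom
feed also needs id and updated
elements). The function name
read_entries is hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

# Simplified, made-up samples of the same post in each format.
RSS_SAMPLE = """<rss version="2.0"><channel>
<title>Example</title><link>http://example.com/</link>
<description>Made-up feed</description>
<item><title>Hello</title><link>http://example.com/hello</link></item>
</channel></rss>"""

ATOM_SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example</title>
<entry><title>Hello</title><link href="http://example.com/hello"/></entry>
</feed>"""


def read_entries(feed_xml):
    """Return (title, link) pairs from an RSS 2.0 or an Atom feed."""
    root = ET.fromstring(feed_xml)
    entries = []
    if root.tag == "rss":
        # RSS 2.0: items live under channel, link is element text.
        for item in root.findall("./channel/item"):
            entries.append((item.findtext("title"), item.findtext("link")))
    elif root.tag == ATOM_NS + "feed":
        # Atom: entries are namespaced, link is an href attribute.
        for entry in root.findall(ATOM_NS + "entry"):
            link = entry.find(ATOM_NS + "link")
            href = link.get("href") if link is not None else None
            entries.append((entry.findtext(ATOM_NS + "title"), href))
    return entries


print(read_entries(RSS_SAMPLE))
print(read_entries(ATOM_SAMPLE))
```

The same information comes out of
both feeds, but the element names
and the way the link is stored
differ. That is why one format can't
literally be dropped in for the
other.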

I'm still learning more about the Atom
Format versus the Atom Protocol.

Suffice it to say, for now, that the
Atom Format is a feed format and the
Atom Protocol is something else.

Ed Abbott

Wednesday, February 10, 2010

The Atom Standard

Here's a Wikipedia article
on the Atom standard:

Atom (standard)

According to the article,
it would seem that the
Atom standard is really
a pair of related standards:

  1. Atom Syndication Format
  2. Atom Publishing Protocol
    (AtomPub or APP)

I'm learning the difference as I
write.

Do I know the difference? I don't.

Seems like the Wikipedia article
makes this distinction and then
drops it. The article leads off
with this distinction and then
doesn't ever seem to discuss it again.

This document seems to describe
the Atom Publishing Protocol:

The Atom Publishing Protocol

I gather from skim-reading
that the Atom Publishing
Protocol is largely based
on HTTP.

According to the Atom
Publishing Protocol document
above, there are 4 basic things
you can do using HTTP methods:

  1. Get a resource
  2. Post a resource
  3. Put a resource
  4. Delete a resource

Sounds like a Get is
how you read a resource.

Sounds like Post is
how you create a brand-new
resource.

Sounds like Put is
how you update a resource
that already exists.

Sounds like Delete
could only be the deletion
of an existing resource.

OK. In my mind, at least,
this simplifies the Atom
Publishing Protocol somewhat.

If all you are doing is four
basic functions, I can follow
that.

The four basic functions again
are:

  1. Read a resource
  2. Create a brand-new resource
  3. Change an existing resource
  4. Delete a resource
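
Here's a minimal sketch in Python
of how the four functions might map
onto HTTP requests. The two URLs
are made up, and nothing is actually
sent over the network. The requests
are only built and inspected. I'm
assuming a new entry is posted to a
collection address, while reads,
changes, and deletes go to the
address of the entry itself.

```python
import urllib.request

# Hypothetical addresses; a real server advertises its own.
COLLECTION_URL = "http://example.com/blog/entries"
ENTRY_URL = "http://example.com/blog/entries/1"

# A made-up, minimal Atom entry to send as the request body.
ENTRY_XML = b"""<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Hello</title>
  <content type="text">First post</content>
</entry>"""

ATOM_ENTRY_TYPE = {"Content-Type": "application/atom+xml;type=entry"}

requests = {
    "read": urllib.request.Request(ENTRY_URL, method="GET"),
    "create": urllib.request.Request(
        COLLECTION_URL, data=ENTRY_XML, headers=ATOM_ENTRY_TYPE, method="POST"
    ),
    "change": urllib.request.Request(
        ENTRY_URL, data=ENTRY_XML, headers=ATOM_ENTRY_TYPE, method="PUT"
    ),
    "delete": urllib.request.Request(ENTRY_URL, method="DELETE"),
}

# Nothing is sent; we only look at what each request would be.
for action, request in requests.items():
    print(action, request.get_method(), request.full_url)
```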

OK. Now I'm starting to understand
better.

The Atom Publishing Protocol
is not literally HTTP. I was
wondering about this.

Rather, HTTP is used to define
client-server interaction. I take
this to mean that the Atom
Publishing Protocol is not
HTTP itself. Rather, it borrows
its basic way of doing things from
HTTP.

This is a pretty good example of how
knowing a little bit about one thing
helps you learn more about something
else. In this case, knowing a little
HTTP helps one understand the
Atom Publishing Protocol.

Time to wrap up this post.

OK. I started out thinking I was
going to write about the Atom
Standard. However, I quickly
discovered that the Atom Standard
seems to have two major branches.

Here are the branches:

  1. Atom Syndication Format
  2. Atom Publishing Protocol
    (AtomPub or APP)

I got so caught up in studying the
Atom Publishing Protocol that
I ignored the other branch, the
Atom Syndication Format.

I'll get back to the Atom Syndication
Format later.

Ed Abbott

Tuesday, February 9, 2010

Feed Buttons

This afternoon, I've been
studying feed buttons.

It seems that feed buttons
fall into two broad categories.

I'm a newbie here. So my thoughts
on this may change. However, right
now, the two categories of feed
buttons are:

  1. Raw XML feed buttons
  2. Feed buttons that subscribe
    to a specific reader, such as
    Google Reader

This threw me when I first started
looking into this. The thing that
threw me is that raw XML feeds look
like web pages in your browser.

It's only when I did a right-click
on the page and asked to
View Source that I realized
that a feed is more than just a web
page. It is a page with lots of
underlying information.

In other words, there are really two
views that can be made of an RSS
feed:

  1. Human readable, like a web
    page
  2. Machine readable, since XML
    is a computer language that is
    frequently read and interpreted
    by a machine

OK. I think I'm finally starting to
get the basic idea of what an RSS feed
is. It is raw XML that looks like a
web page if you look at it in a browser.
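
To see what that raw XML looks
like underneath, here's a minimal
sketch in Python that builds a tiny
RSS 2.0 feed and prints it. The feed
contents are made up. I'm relying
on the RSS 2.0 rule that a channel
needs a title, a link, and a
description.

```python
import xml.etree.ElementTree as ET

# The root element carries the RSS version.
rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")

# The three required channel elements (made-up values).
ET.SubElement(channel, "title").text = "Example Feed"
ET.SubElement(channel, "link").text = "http://example.com/"
ET.SubElement(channel, "description").text = "A made-up feed"

# One story in the feed.
item = ET.SubElement(channel, "item")
ET.SubElement(item, "title").text = "First story"
ET.SubElement(item, "link").text = "http://example.com/1"

raw_xml = ET.tostring(rss, encoding="unicode")
print(raw_xml)
```

What gets printed is the machine
readable view. A browser or feed
reader is what dresses it up to
look like a page.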

This helps me as I've been stumbling
around in the dark here.

That is to say, the illumination has
been coming in the last hour or so
but it has been coming slowly.

Ed Abbott

Google Reader

Here's the home page for
Google Reader:

Google Reader

Here's the official Google
Reader blog:

Google Reader Blog

Looks like Google Reader is
a feed reader. That is to
say, you use it to stay current
with RSS feeds you have subscribed
to.

Ed Abbott

RSS 2.0 Specification

Here's where the RSS 2.0
specification is kept:

RSS 2.0 at Harvard Law

As you can see, the RSS
2.0 specification appears
to be kept on a web server
hosted by Harvard Law School.

Apparently, the creator of
the specification, Dave Winer,
worked there at the time.

The date on the specification
appears to be July 15, 2003.

An interesting aspect of the
specification is that it is
frozen. It is frozen with
the intent that other people
will build on top of it.

This is not such a bad idea
as it is hard to chase a moving
target.

I suspect the intent of freezing
the specification was for people
to start using the specification
rather than to continue to put
their creative energies into innovation
of the specification itself.

From reading the document, it appears
that all future innovation
is to take place using the RSS 2.0
specification but by a different name.

In other words, go ahead and add
to the spec but please come up with
your own name.

I see several advantages to this
approach:

  1. A specification by a different
    name has to build its own reputation.
  2. A specification with its own name
    lives or dies by its own reputation
  3. A specification by its own name
    will never reach critical mass or
    mass acceptance until it is worthy of it.

In other words, it would seem that Dave
Winer did not want other people riding
his coattails. Rather, he wanted these
people, and their specification, to build
their own reputation.

Ed Abbott

Wednesday, February 3, 2010

Trying Out Akregator

OK. I've just now tried
Akregator for the first
time. Akregator is my first
RSS feed aggregator.

I find that I don't yet have
to sign up for any feeds.
Akregator seems to come with
some feeds pre-installed.

In general, the pre-installed
feeds are Akregator and
KDE relevant. These are feeds
that it is assumed you might
be interested in.

After reading some of these
feeds, I already see that
partial feeds are a better
idea than I thought they
were in my last post.

I can now see that reading
the top of a post in the feed
and then finishing it in its
original form elsewhere works
pretty well.

Seems that this is a time-
saver as it enables the
reader to decide how deeply
they'd like to go into the
post or article before making
a bigger commitment.

I may be assuming too much
here. I'm assuming that the
partial nature of the feed is
from the feed itself.

I suppose that truncating the feed
could also be a function of Akregator
itself. I haven't figured this out
yet.

In any case, you can click to
see the whole feed. The whole
feed will then appear in whole
form at its original URL.

If I right-click in Akregator, I
find I can ask to see the source
of the feed as a web page in my
browser.

In other words, I'm not looking at
the feed. I'm now looking at the
source of the feed, the original
writing on a web page or blog.

So far, I've found 3 ways to view
a feed in Akregator:

  1. As text inside
    of Akregator
  2. As a web page
    inside of Akregator
  3. As a web page in
    the default browser

You get the view of the
feed as text by default
in Akregator.

You get the view of the
feed as a web page inside
of Akregator by clicking
on Complete Story
at the bottom of the feed.

You get the view of the feed
as a web page in the default
browser by right-clicking on
the web page view of the feed
within Akregator.

Seems to me that this is a
cascading order of viewing the
feed. Here's the order in
which you would naturally view
a feed:

  1. First you see the feed
    as text within Akregator
  2. Next, should you click on
    the Complete Story, you
    see the feed as a web page
    within Akregator
  3. Next, you can take this
    one step further and right
    click to get the feed to
    appear in the default browser

OK. Starting to see how this
all works.

Ed Abbott

Akregator Versus Liferea

Ok. This post is for people
who use Linux as their operating
system.

If you are a Windows user or a
Macintosh user, this post is of
no interest to you. Read no
further.

Read no further unless you are
at least considering Linux.

Here's a guy who has written a
nice article comparing two RSS
feed readers, Akregator and
Liferea:

RSS reader:
Akregator vs Liferea,
and who won


I'm an RSS reader newbie myself.
I've never ever used an RSS
reader.

Based on his review, it seems
to me I should try Akregator
first.

I'm already using the KDE interface.
For example, Kmail is my email
client.

So it only makes sense for me
to continue in the KDE direction,
unless I really want to learn
something new.

I always try to leverage what I
already know, if possible.

I was pleased to read that Akregator
searches through RSS feeds very quickly.
I'm sure this will be very important
to me in the future.

Time to give Akregator a try.

Ed Abbott

RSS Full Versus Partial Feeds

In my limited understanding,
there are basically two ways
you can use RSS to feed your
content to others:

  • A partial feed
  • A full feed

A full feed is when you give
your feed audience the whole
article.

A partial feed is when you
give them just the top of
the article.
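
Here's a minimal sketch in Python
of how the top of an article might
be cut out for a partial feed. The
function name make_teaser and the
word limit are my own assumptions.
Real feed software may cut by
characters or paragraphs instead.

```python
def make_teaser(full_text, max_words=30):
    """Return the first max_words words, plus " ..." if text was cut."""
    words = full_text.split()
    if len(words) <= max_words:
        # Short enough already: a partial feed equals the full feed.
        return full_text.strip()
    return " ".join(words[:max_words]) + " ..."


article = (
    "Web syndication is a simple idea with broad "
    "implications for anyone who publishes."
)
print(make_teaser(article, max_words=6))
```

With a limit of six words, the
example prints only the opening of
the sentence followed by three dots.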

There are advantages to both
ways of doing things.

A partial feed is really a teaser
of sorts. You are teasing your
audience into clicking here
for more information.

In other words, click here
to read the whole thing.

The advantage of the full feed is
that you do not have to do that.
You do not have to click anything
to read the whole thing.

Here's an article about full feeds
versus partial feeds:

The End of the
RSS Full Text Free Ride


The article refers to partial feeds
as summary feeds. It refers
to full feeds as full text feeds.

Same idea.

It seems to me that the real intent
of the partial feed is to get the
reader to come by for a visit to
your website or blog.

The intent of the full feed, on the
other hand, is to distribute the
information as widely as possible
without anything in the way to inhibit
or slow down the flow of information.

I suspect that as I learn more, I'm
going to favor the full feed over the
partial feed.

The more I think about it, the more I
realize that I don't really like doing
things halfway.

That's my very inexperienced and
unknowledgeable opinion at this time.

I'm an RSS beginner so my opinions of
full versus partial feeds will undoubtedly
mature over time.

Ed Abbott

Tuesday, February 2, 2010

Huge RSS Feed Button Collection

Here's a webpage with an
absolutely huge RSS feed
button collection:

Buttons for RSS

I'm laughing to myself as
I write this. I had no idea
how many of these buttons were
out there.

This page also seems to be
a wonderful tutorial. However,
I've only looked at it superficially.

Here's the first page of the
tutorial:

Wizard Creek Consulting.
RSS Tutorial Introduction


From the first page alone, I can
see that this man is a very dedicated
tutorial writer.

His name is Les Bain and I can see he
has worked hard to make his tutorial
a very good one.

Interestingly, he has a link blog as
well for RSS. Here it is:

RSS - Really Simple Syndication

By link blog, I assume he means that
he links to other people's material
on RSS.

There's a lot here. No doubt about
it. I could spend days and days
poring over this material, studying
it, and learning more.

Looks like Les Bain is one of those
great sharers of information.

Ed Abbott

Robert Teeter on RSS Feeds

Thank God for people who share
information freely! Here's a
wonderful web page from someone
who I don't know named Robert
Teeter:

RSS: What it is,
Where to get it,
How to make it,
How to use it


It's clear to me that this guy spent
a lot of time making a lovely web
page that other people would benefit
from.

I'll be studying this page some more.
Looks like it has a lot of good
information.

OK. I already see one good tip and
that is to look for RSS icons at your
favorite site.

I'm going to be on the lookout for
these buttons and will write more about
them in the future.

Ed Abbott

History of Web Syndication Technology

Here's a Wikipedia article on
the history of web syndication
technology:

History of web syndication technology

This article gives the history
of web syndication technologies
leading up to RSS 2.0 and
Atom and beyond.

Sometimes it helps to know the
history.

More to Web Syndication
Than Meets the Eye


One of the things I find interesting
about web syndication is that it is
one of those technologies where there
is more here than meets the eye.

There are usually two qualities that
make a technology one that has more to
it than meets the eye:

  1. It is simple
  2. It has broad implications

Web syndication seems to have both
of these qualities.

The idea that you are going to summarize
what you've published on your website
or blog is a simple one.

Basically, you are sharing with the
world when you do this.

However, this idea also has broad
implications. There are so many different
directions you can go with it.

I Love Lucy


It reminds me of the story I once heard
about the I Love Lucy show.

Seems the network allowed the rights
for re-broadcast to the show to go to
Lucy and her husband Desi Arnaz on the
condition that they would pay for the
original broadcasts to be filmed out
of their own pocket.

The network saw little potential in the
re-broadcast rights, what we now call
syndication rights. That's how the story
goes.

Of course, we now know that these syndication
rights are where the real money is. Syndication—
a simple idea with broad implications.

Syndication of Web Properties


I suspect that syndication of web properties
may have some of the same implications. Right
now, web syndication is in its infancy just as
television syndication was in its infancy in
the 1950s and 1960s.

I once watched a television interview with a
Hollywood producer who said something to this
effect: "There's always too much money chasing
too little talent in Hollywood."


So it is on the web. On the web, there are always
too many web pages chasing way too little content.

This being the case, web syndication is important
because the one thing that people seem to lack on
their websites is actual content.

You could go blind, looking at some of these websites
that look so beautiful, but are so devoid of content.

Just as TV can be a vast wasteland at times, so it is
with the web. Syndication helps to distribute the
best content a little more evenly.

Ed Abbott