
HTML Character Entities, Problems For RSS Readers

HTML character entities can cause RSS/syndication readers to fail when they read WordPress comment RSS feeds. Fortunately, a plugin has been written to resolve the issue. Entity2NCR has a confusing name but a simple purpose: it converts HTML character entities to their numeric equivalents (NCR stands for numeric character reference).

HTML Explained

The Hypertext Markup Language (HTML) is a simple markup language used to create hypertext documents that are platform independent. HTML documents are SGML documents with generic semantics appropriate for representing information from a wide range of domains. It can represent hypertext news, mail, documentation and hypermedia as well as menus of options, database query results and simple structured documents with in-line graphics. It can likewise represent hypertext views of existing bodies of information.

The World Wide Web (WWW) has been using HTML since 1990, making it one of the most widely used computer languages in the world. HTML owes its popularity to being the markup in which content is published on the web. Programmers were quick to recognize its user friendliness, since it is easy to learn.

This ease of coding contributed significantly to the proliferation of web sites. However, HTML is not a complete programming language because it lacks conditional tests and flow control statements. Some implementations offer extensions that accomplish these functions, but the extensions are not part of the HTML standards. Embedding code from a suitable programming language inside HTML adds the power of a real programming language.

A character entity can be written in two ways in HTML: as a symbolic reference or as a numeric reference. Symbolic references start with an ampersand and end with a semicolon. Between the two is a name for the symbol, generally a shortened version of its full description, as in &copy; for the copyright sign. The name is case sensitive and usually lower case, though there are exceptions such as &Eacute; for É.

Numeric references also start with an ampersand and finish with a semicolon, but between them is a number preceded by a hash, as in &#169;. These are less memorable than symbolic references, but each one identifies exactly one character by its code number, so a reader does not need a table of entity names to interpret it. A numeric reference is not a single byte in the file (&copy; and &#169; are both six characters long), so it does little to reduce download time. Symbolic references are sometimes referred to as entity references, while numeric references may be written in decimal (&#169;) or hexadecimal (&#xA9;) form.
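
The relationship between the two kinds of reference can be checked with a short script. This is a minimal sketch in Python, chosen here only for illustration; it relies on the entity table and the unescape function in Python's standard html module, and the variable names are made up for the example.

```python
import html
from html.entities import name2codepoint

# The symbolic name "copy" maps to a code point number in Python's entity table.
code_point = name2codepoint["copy"]

symbolic = "&copy;"
decimal = f"&#{code_point};"          # decimal numeric reference
hexadecimal = f"&#x{code_point:X};"   # hexadecimal numeric reference

# All three spellings decode to the same single character.
decoded = [html.unescape(ref) for ref in (symbolic, decimal, hexadecimal)]
print(decimal, hexadecimal, decoded)
```

The symbolic form needs a lookup table to be understood, while the numeric forms carry the character's number directly.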

Most unusual characters can be entered directly without any problem. However, HTML character entities can be used if a problem does come up. Line and paragraph breaks are recognized automatically. A couple of blank lines can be added where a paragraph is not recognized.

A character entity is also the way to display characters that are normally reserved for use in HTML itself. For instance, the less-than (<) and greater-than (>) signs are part of the HTML tag structure, so both symbols are reserved. To display these symbols on one's site, the character entities &lt; and &gt; can be used.
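
As an illustration of escaping reserved characters, the sketch below uses the escape function from Python's standard html module. The sample sentence is invented for the example, and Python is used only as a convenient demonstration language, not because WordPress uses it.

```python
import html

# Text that should be shown literally, including the reserved characters.
raw = "Use <b> for bold & <i> for italics"

# quote=False escapes only &, < and >, which is enough for text between tags.
escaped = html.escape(raw, quote=False)
print(escaped)

# Decoding the entities gives back the original text.
assert html.unescape(escaped) == raw
```

The ampersand has to be escaped too, because it is the character that starts every entity.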

Problems

Many WordPress users run afoul of character entities appearing in their comment RSS feeds, which cause many RSS/syndication readers to fail. The WordPress plugin Entity2NCR seeks to resolve this by converting HTML character entities such as &raquo;, &amp; and &copy; to their numeric equivalents. The plugin targets RSS output, but can also be applied to posts if the user so wishes.
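
The failure and the fix can be reproduced outside WordPress. The following is a rough sketch in Python and is not the plugin's own code: the function names entities_to_ncr and parses_as_xml and the sample comment are hypothetical, and the sketch leaves the five entities that XML itself predefines untouched, which is a simplifying assumption.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML predefines; a strict XML parser already accepts these.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def entities_to_ncr(text: str) -> str:
    """Replace known named entities (e.g. &raquo;) with decimal references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave unknown or predefined names alone
        return f"&#{name2codepoint[name]};"
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, text)

def parses_as_xml(fragment: str) -> bool:
    """Report whether a strict XML parser accepts the fragment as element text."""
    try:
        ET.fromstring(f"<description>{fragment}</description>")
        return True
    except ET.ParseError:
        return False

comment = "Next page &raquo; &copy; 2005 Q&amp;A"
converted = entities_to_ncr(comment)
print(converted)
print(parses_as_xml(comment), parses_as_xml(converted))
```

A strict XML parser rejects the original comment because &raquo; and &copy; are HTML names that plain XML does not define, while the numeric forms need no definitions at all.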

Entity2NCR does not need to be installed on WordPress 1.5.1 and above. Installing it there only causes problems, because the plugin's function has the same name as a function in the WordPress core. Upgrading to the most recent version of WordPress is recommended, since the plugin's functionality is already incorporated. Before upgrading to 1.5.1, the user should deactivate Entity2NCR in the Plugin Admin and delete its file from the wp-content/plugins directory, since it would only take up space.

Entity2NCR is installed by downloading the zip file, extracting Entity2NCR.php from it, uploading that file to the wp-content/plugins/ directory and activating the plugin in WordPress. Entity2NCR handles the standard assortment of HTML character entities plus some of the more unusual and obscure ones. While the plugin primarily focuses on RSS output, from both posts and comments, it can also convert character entities in the regular content of one's blog. To enable this, the user removes the comment marker from the add_filter line, found at the end of the plugin file, for each WordPress function Entity2NCR should work on.

The RSS 2.0 spec is vague. Feeds that follow it can be valid, accurate and useful, meaning that the contents of the feed reflect the best possible representation of the article content. However, the spec does not say what to do if an article title contains HTML code or entities. It leaves a lot of other things unsaid as well. In fact, an entire industry has sprung up around interpreting and fixing the various semantic differences between feeds. RSS application developers need to agree on basic answers to these fundamental questions instead of holding endless, conflicting discussions that do not help in any way.

Attribute Values

An HTML author should always put attribute values in quotes, although the formal rules allow the quotes to be omitted in some cases. By default, SGML requires that all attribute values be delimited using either double quotation marks or single quotation marks. Single quote marks can be included within an attribute value when the value is delimited by double quote marks, and vice versa. Authors can also use numeric character references to represent double quotes (&#34;) and single quotes (&#39;), or use the character entity reference &quot; for double quotes. In certain cases the value of an attribute may be specified without any quotation marks; the value may then contain only letters, digits, hyphens, periods, underscores and colons. Using quotation marks is highly recommended even when it is possible to omit them.
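
The quoting rules can be expressed as a small check. This is a sketch under stated assumptions: the character set for unquoted values follows the HTML 4.01 wording (letters, digits, hyphens, periods, underscores and colons), and the helper names may_omit_quotes and format_attribute are hypothetical, written in Python purely for illustration.

```python
import html
import re

# Characters assumed safe in an unquoted attribute value (HTML 4.01 rules).
UNQUOTED_SAFE = re.compile(r"[A-Za-z0-9._:-]+")

def may_omit_quotes(value: str) -> bool:
    """True only if the whole, non-empty value consists of safe characters."""
    return UNQUOTED_SAFE.fullmatch(value) is not None

def format_attribute(name: str, value: str) -> str:
    """Always emit a double-quoted attribute, escaping &, <, > and quotes."""
    return f'{name}="{html.escape(value, quote=True)}"'

print(may_omit_quotes("10"), may_omit_quotes("two words"))
print(format_attribute("title", 'Say "hi"'))
```

Always calling a helper like format_attribute follows the recommendation above: there is no rule to remember, and a later edit to the value cannot silently make the markup invalid.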

There are several reasons to always use quotes around attribute values. It is easier, since there is no need to memorize and recall the rules for allowable omission. Quotes are also always required in XML. And when an HTML file is edited later, it is easy to forget to add quotes to an attribute value that has been changed in a way that makes them mandatory. The drawbacks are the extra typing and the extra storage and transmission time, which are minor: quotes constitute just a small fraction of an HTML file.

User published content is licensed under a Creative Commons License.
Copyright © 2005-2008 Free Articles by ArticlesBase.com, All rights reserved.