Thursday, October 2, 2008

To Validate or Not to Validate: XHTML, Dojo and Custom Attributes

Well, I must admit it: now I am really confused. I was validating the XHTML output of the pages in our current project. We use some custom attributes to create widgets with a combination of JavaScript and HTML (a custom attribute is an attribute that is not part of the HTML or XHTML specifications; in <div id="dialog" custom_type="dialog">, the custom attribute is "custom_type"), so I expected the pages to fail validation, and they did. But I remembered that a big and popular JavaScript toolkit uses custom attributes too, so I thought, "let's see how they make their HTML output valid XHTML", and what I found confused me a lot.

Dojo does not produce valid XHTML. They removed the Dojo namespace from their markup, so their pages no longer validate (http://dojotoolkit.org/book/dojo-porting-guide-0-4-x-0-9/widgets/general). Why? I don't really understand it, and I would really like to know.

Well, I think they are tired of the war. Under "The Parser" they write:


There's a big religious war about whether pages validating or not is important and I don't want to beat that dead horse anymore.


Lucky me: Dojo was never my favorite JavaScript toolkit, and I am not using it in my current project; I am using jQuery. (Dojo fans, sorry: if you need your markup to be valid then, as far as I know, you will have to make changes to the parser!)

But what can I do now? Must I forget about validation altogether?

Well, I looked deeper and found a really nice post from John Resig: http://ejohn.org/blog/html-5-data-attributes/

It seems that HTML 5 (http://www.w3.org/html/wg/html5) includes some cool features, like the data- attributes for declaring custom attributes, so in the future I will not have to deal with custom DTDs or custom XHTML namespaces! (Namespaces are not used to validate XHTML; thanks for the comment, zcorpan!)
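To make the idea concrete, here is a minimal sketch of how a script could read data- attributes. The helper name collectDataAttributes and the attribute data-widget-type are hypothetical, invented only for this illustration. The function relies on nothing more than a list of { name, value } pairs, which is the shape of element.attributes in a browser.

```javascript
// Hypothetical helper (not part of any library): collects every attribute
// whose name starts with "data-" into a plain object.
// `attributes` is any array-like list of { name, value } pairs, for example
// element.attributes in a browser.
function collectDataAttributes(attributes) {
  var result = {};
  for (var i = 0; i < attributes.length; i++) {
    var attr = attributes[i];
    if (attr.name.indexOf("data-") === 0) {
      // Strip the "data-" prefix: "data-widget-type" becomes "widget-type".
      result[attr.name.slice(5)] = attr.value;
    }
  }
  return result;
}

// Stand-in for the attributes of
// <div id="dialog" data-widget-type="dialog">, so the sketch runs without a page.
var sampleAttributes = [
  { name: "id", value: "dialog" },
  { name: "data-widget-type", value: "dialog" }
];

var widgetData = collectDataAttributes(sampleAttributes);
console.log(widgetData["widget-type"]); // "dialog"
```

In a real page you would pass something like document.getElementById("dialog").attributes instead of the sample array.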

But now I'm even more confused! What should I do?

Should I still try to make the pages validate or not?

Well, I think I could use XHTML namespaces to make the pages validate. (I will follow the data- attribute approach in order to stay compatible with the HTML5 specification.) So I will have to make some changes now. It is a little more work, but at the end of the day I will feel better about myself. Seriously, though, I'm still confused, because the future is not as clear as I thought... OK, you don't believe me? Take a look at these good articles:





Take a look at this post too:


Wednesday, October 1, 2008

Regular Expressions


Hi, again…

Regular expressions are extremely useful. You can think of them as "wildcards on steroids" (I like that phrase; I found it here: http://www.regular-expressions.info/index.html). They save you a lot of time when you search inside a string and want the matches for a pattern, or when you want to do replacements based on a pattern. I don't really remember when I started to use regular expressions, but now I use them quite often. So here are some tips that I've found so far, in JavaScript:

(If you already know a lot about regular expressions, please go directly to my seventh tip; maybe you will find it useful.)

1. You can match a specific character using its hexadecimal index in the character set, e.g. \xA9 matches the copyright symbol in the Latin-1 character set.

This one is really useful when you're working with Unicode, but there are several things to consider. Keep in mind that Unicode may encode a single grapheme as two separate characters, e.g. "à" can be a single character or "a" followed by a combining grave accent, so many regular expressions can fail mysteriously, and it can be hard to find out why.
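A small sketch of both points. The variable names are made up for the example, and the last line assumes an engine that supports String.prototype.normalize (ES2015 or later).

```javascript
// \xA9 is the hexadecimal escape for the copyright sign (code point U+00A9).
var copyrightSign = /\xA9/;
console.log(copyrightSign.test("\u00A9 2008")); // true

// The same visible grapheme, encoded in two different ways.
var precomposed = "\u00E0"; // "à" as a single code point
var decomposed = "a\u0300"; // "a" followed by a combining grave accent

var onlyAGrave = /^\xE0$/;
console.log(onlyAGrave.test(precomposed)); // true
console.log(onlyAGrave.test(decomposed));  // false: the string holds two characters

// Normalizing to NFC composes the pair back into a single code point.
// Assumption: the engine provides String.prototype.normalize (ES2015+).
console.log(onlyAGrave.test(decomposed.normalize("NFC"))); // true
```

Normalizing the input before matching is one way to avoid the mysterious failures; writing the pattern to accept both encodings is another.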

2. Use shorthand character classes when possible, e.g. use \d instead of [0-9], and \w to match any word character (alphanumeric characters plus underscore). The regular expression will be more readable.
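For example, the two date patterns below accept exactly the same strings in JavaScript, but the second is easier to read. The date format and the variable names are only an illustration.

```javascript
// Same pattern written with explicit ranges and with the \d shorthand.
var explicitDate = /^[0-9]{4}-[0-9]{2}-[0-9]{2}$/;
var shorthandDate = /^\d{4}-\d{2}-\d{2}$/;

console.log(explicitDate.test("2008-10-02"));  // true
console.log(shorthandDate.test("2008-10-02")); // true
console.log(shorthandDate.test("2008-1O-02")); // false: letter "O", not a digit

// \w covers letters, digits and the underscore, but not the hyphen.
var wordOnly = /^\w+$/;
console.log(wordOnly.test("custom_type")); // true
console.log(wordOnly.test("custom-type")); // false
```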

3. Keep in mind that "." matches (almost) any character. According to the ECMAScript specification it does not match line breaks, and that is how Firefox behaves, while IE appears to match line breaks with "." as well. This difference is really important to remember.
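One way to avoid depending on how "." treats line breaks is to say explicitly what you mean. The sketch below uses [\s\S] (any whitespace or any non-whitespace character) when a line break should be allowed; the comments describe the result in an engine that follows the ECMAScript specification.

```javascript
var twoLines = "first line\nsecond line";

// "." stops at the line break, so this pattern does not match across it.
var dotAcrossLines = /line.second/;
console.log(dotAcrossLines.test(twoLines)); // false in a spec-compliant engine

// [\s\S] matches any character at all, line breaks included.
var anyCharAcrossLines = /line[\s\S]second/;
console.log(anyCharAcrossLines.test(twoLines)); // true
```

When line breaks must never match, a negated class such as [^\r\n] states that intent in the same explicit way.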

4. Use capture groups if you plan to do something with the results. You define a group using "(" and ")", as in (q)(u): this matches q followed by u and gives you 2 capturing groups, which you can access as $1 and $2. Please note that group indexes begin at 1; index 0 holds the entire match, not a specific group.
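Here is a short sketch of both ways to reach the groups: through the array returned by exec, and through $1, $2 in a replacement string. The date example is only an illustration.

```javascript
// exec returns an array: index 0 is the whole match, 1 and 2 are the groups.
var quMatch = /(q)(u)/.exec("quick");
console.log(quMatch[0]); // "qu"
console.log(quMatch[1]); // "q"
console.log(quMatch[2]); // "u"

// In a replacement string the groups are available as $1, $2, $3...
var isoDate = "2008-10-02";
var reordered = isoDate.replace(/(\d{4})-(\d{2})-(\d{2})/, "$3/$2/$1");
console.log(reordered); // "02/10/2008"
```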

5. Lazy repetition is useful but it can be expensive, as in <.+?>, which matches any HTML tag (invalid tags included!). If you know exactly which characters to expect, it is better to use <[^<>]+>, which will perform better than the "." version.
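The sketch below does not measure speed; it only shows the difference in what the two patterns accept. On well-formed markup they agree, but a stray "<" makes the lazy version swallow text that is not a tag. The sample strings are made up.

```javascript
var lazyTag = /<.+?>/g;
var negatedTag = /<[^<>]+>/g;

// On clean markup both patterns find the same four tags.
var cleanHtml = '<div id="dialog"><span>hello</span></div>';
console.log(cleanHtml.match(lazyTag).length);    // 4
console.log(cleanHtml.match(negatedTag).length); // 4

// A stray "<" in the text shows the difference.
var messyText = "a < b and <b>bold</b>";
console.log(messyText.match(lazyTag));    // [ "< b and <b>", "</b>" ]
console.log(messyText.match(negatedTag)); // [ "<b>", "</b>" ]
```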

6. You can create a regular expression in Javascript in two ways:

a. Using literal notation: /(q)(?=u)(u)/g

b. Using the constructor: var regx = new RegExp("(q)(?=u)(u)", "g");

Keep in mind that with the second way you need to be careful with special characters like "\".



var reg = new RegExp("((ht|f)tp(s?)://)?((w){3}\\.)?(\\w)+((\\.)(([a-zA-Z])+/?)?)+", "g");

Please note that the "\." of the literal notation becomes "\\." in the constructor notation, and likewise "\w" becomes "\\w".
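The escaping rule is easy to check with the source property, which returns the pattern text that the engine actually received. The pattern here (a simple decimal number) and the variable names are chosen only for the example.

```javascript
// The same pattern in both notations.
var literalNumber = /\d+\.\d+/g;
var constructedNumber = new RegExp("\\d+\\.\\d+", "g");
console.log(literalNumber.source === constructedNumber.source); // true

// Forgetting to double the backslashes: inside a string, "\d" is just "d"
// and "\." is just ".", so the engine receives a different pattern.
var forgottenEscapes = new RegExp("\d+\.\d+", "g");
console.log(forgottenEscapes.source); // "d+.d+"

console.log("version 1.5".match(constructedNumber)); // [ "1.5" ]
console.log("version 1.5".match(forgottenEscapes));  // null
```

The constructor is still the right choice when the pattern is built at run time, for example from user input; the literal notation is simpler for fixed patterns.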

7. Do yourself a big favor (a really big favor, even if you're a regular expression guru like my friend Paolo, who is really good with regular expressions!) and consider using this free tool:



This online tool is as simple as it is impressive. It is really complete, and the regular expressions created there work quite well in JavaScript; you can build your regular expressions and test them on the fly! (This tool comes to you courtesy of my good friend Hans, who recommended it to me, and now I simply can't imagine working with regular expressions without it.)

There is also a desktop version located here: http://www.gskinner.com/RegExr/desktop/ (it needs Adobe AIR installed).

Or use this other tool:



It is really impressive too, it has a lot of features, and it is programmed in JavaScript, but I don't know why I prefer the first one :P, maybe because it looks nicer.

So that's all for now.