Recently in Perl programming Category
Listen to any kind of syndicated talk radio program and you'll usually hear about some companion website the program has. Usually, there are a handful of free things you can get on a program's website, but many of these sites have a pay-to-play members' area where the really good content is. This includes MP3 downloads of the shows, access to live audio and/or video streams, special behind-the-scenes content, forums, desktop backgrounds, etc.
The MP3 downloads are very convenient for people who don't have the luxury of sitting in front of a radio (or driving a car) for a solid three hours while a radio program is broadcast (with advertisements). It's also a boon for people who find radio advertisements annoying.
The only problem with the MP3 downloads is that theme music and produced portions of the program cannot, by law, be included in the MP3 file because otherwise the MP3 would be a copyright violation.
Live streams, on the other hand, are not subject to the above described restriction because they're like a broadcast in nature. They're not a time-shift of the original program. So, if you listen to the live stream or even listen to a pre-recorded program as a stream, music and produced segments may be included.
I listen to the Glenn Beck radio program quite often. I used to download the MP3 files to listen to in the car, but it got annoying every time Glenn and his producers would put together a segment like "Sportscasters at the 2031 animal-human hybrid baseball games", or "The History Of the Democratic Superdelegates" and I would hear Glenn say, "Listen to this... [pause] Oh man! That was great! Wasn't that great, Stu? Oh yeah! Alright! Dan? Wasn't that just the best? Yeah. Oh yeah."
I decided I needed to figure out how to save a stream.
I knew it was possible. Lots of software applications, for just about any operating system, will convert audio from a live stream into a static WAV file or similar. The open source program mplayer is one such example.
Breaking it down
First of all, I needed to figure out how the stream content made its way to my computer.
After I've logged into the Glenn Beck website as an Insider, I can click a link to listen to a stream of a particular hour of the program (or the whole program) in Windows Media format or RealAudio format. I figured I'd have better luck extracting the audio from the Windows Media format, so I went that route. Instead of just clicking the link and letting my web browser find some program that could handle the content, I saved the content to a file and then looked at the file.
The file it saved was a fairly straightforward XML file that looked something like this:
<ASX VERSION="3.0">
<TITLE>Glenn Beck</TITLE>
<AUTHOR>Premiere Radio Networks</AUTHOR>
<COPYRIGHT>Copyright 2008</COPYRIGHT>
<ENTRY>
<TITLE>Glenn Beck 1</TITLE>
<AUTHOR>Premiere Radio Networks</AUTHOR>
<COPYRIGHT>Copyright 2008</COPYRIGHT>
<REF HREF="mms://a0011.v67134.c6713.g.vm.akamaistream.net/7/0011/6713/v08060322/glennbeck.download.akamai.com/6713/_!/shows/2008/06/03/GLENNBECKWIN20080603.WMA?auth=blahblahblahblahblah" />
<REF HREF="http://a0011.v67134.c6713.g.vm.akamaistream.net/7/0011/6713/v08060322/glennbeck.download.akamai.com/6713/_!/shows/2008/06/03/GLENNBECKWIN20080603.WMA?auth=blahblahblahblahblahblah" />
</ENTRY>
<ENTRY>
<TITLE>Glenn Beck 2</TITLE>
<AUTHOR>Premiere Radio Networks</AUTHOR>
<COPYRIGHT>Copyright 2008</COPYRIGHT>
<REF HREF="mms://a0011.v67134.c6713.g.vm.akamaistream.net/7/0011/6713/v08060322/glennbeck.download.akamai.com/6713/_!/shows/2008/06/03/GLENNBECKWIN20080603_CLIP01.WMA?auth=blahblahblahblahblahblah" />
<REF HREF="http://a0011.v67134.c6713.g.vm.akamaistream.net/7/0011/6713/v08060322/glennbeck.download.akamai.com/6713/_!/shows/2008/06/03/GLENNBECKWIN20080603_CLIP01.WMA?auth=blahblahblahblahandblah" />
</ENTRY>
...and so on.
This XML defines the MMS URLs for each segment of the show. There are several segments each hour. These individual MMS URLs are what I needed to feed to the application that was going to convert the audio stream to a file. In my case, I decided to use mplayer because it's just so good at everything it does!
The command line for doing the stream-to-file conversion looks like this:
mplayer -vc null -vo null -ao pcm:fast:file=dumpfile.wav \
'mms://a0011.v67134.c6713.g.vm.akamaistream.net/blahblahblah...'
The real magic in the above command is where I use -ao pcm to tell mplayer to use the PCM file writer audio output driver (instead of sending the audio to my speakers).
This gives me a WAV file which I'll want to convert to an MP3 or Ogg-Vorbis file.
To convert a WAV file generated by the mplayer command above to an MP3 file, I use the open source lame tool:
lame -mf -q2 dumpfile.wav GlennBeck.mp3
Or, convert it to Ogg-Vorbis (the completely open and better-sounding-than-MP3 lossy audio codec):
oggenc -q2 --downmix -o GlennBeck.ogg dumpfile.wav
I've now covered the basic mechanical components of converting an audio stream into an MP3 or Ogg-Vorbis file. Next I automate it all.
Automation
Because I'm a long-time Perl junkie, I investigated how I could use a Perl script to act as the glue between the components and automate the whole process of capturing a stream and converting it to MP3 or Ogg-Vorbis.
In the above walk-through, I manually logged into the Glenn Beck website with my web browser. To really completely automate this puppy, I wanted the script to log in for me. It didn't take me very long to figure out the Perl CPAN module WWW::Mechanize was what I needed to use.
WWW::Mechanize does several handy things for the programmer. It loads and parses web pages and can follow links, populate forms, and other basic kinds of interaction. It keeps track of its own cookies and session data too.
To get into the Insider area of the Glenn Beck website, members must enter their username and password on the Insider login page.
Looking at the HTML source for this page, I learned the form was named "aform", the username field was named "iUName", and the password field was named "iPassword".
I now had all the information I needed for WWW::Mechanize to log in:
use strict;
use warnings;
use WWW::Mechanize;

my $agent = WWW::Mechanize->new(
    cookie_jar => {},
);
my $resp = $agent->get('http://www.glennbeck.com/content/insider');
if ($resp->is_success) {
    $resp = $agent->submit_form(
        form_name => 'aform',
        fields    => {
            'iUName'    => 'myusername',
            'iPassword' => 'shhhhhhhh!',
        },
        button => 'submit',
    );
}
Walking through the code above: First, I create the WWW::Mechanize object with an in-memory cookie jar (cookie_jar => {}). Next, I use the object to get() the log-in page. If everything works well so far, I tell the object to find the form named "aform", fill in the username and password fields, and submit the form.
One thing I realized as I was debugging my script was that after I logged in on the Insider page, I was immediately redirected to another page. In order for my script to work, it needed to follow the redirect. This was an easy fix:
my $agent = WWW::Mechanize->new(
cookie_jar => {},
redirect_ok => 1,
);
The page I got redirected to has the links on it for the streaming audio, so I'm exactly where I want to be if I want to capture and convert the latest and greatest Glenn Beck Program audio stream.
WWW::Mechanize can find links within the page with a variety of methods. One of these leverages Perl's excellent support for regular expressions. You can also search for links by the order in which they appear. The link I'm looking for looks like this:
<a href="http://www.premiereinteractive.com/cgi-bin/members.cgi?stream=shows/GLENNBECKWIN20080604&site=glennbeck&type=win_show"><img src="http://media.glennbeck.com/images/common/header_media5off.jpg" name="icon5" width="26" height="34" border="0" id="icon5" onMouseOver="MM_swapImage('icon5','','http://media.glennbeck.com/images/common/header_media5on.jpg',1)" onMouseOut="MM_swapImgRestore()" /></a>
So, my script has the following:
my $link = $agent->find_link( url_regex => qr/${datestr}.*win_show$/ );
$resp = $agent->get($link);
This assumes I have a scalar variable $datestr that contains a formatted date for the show I want to capture.
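One way to build such a $datestr is sketched below. It assumes the site's links use the YYYYMMDD form seen in the link above (GLENNBECKWIN20080604), and the helper name show_datestr is hypothetical, not part of the original script:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper: format an epoch time as YYYYMMDD in local time,
# matching the date embedded in links such as GLENNBECKWIN20080604.
# With no argument it uses the current time.
sub show_datestr {
    my ($epoch) = @_;
    $epoch = time unless defined $epoch;
    return strftime('%Y%m%d', localtime($epoch));
}

my $datestr = show_datestr();    # date string for today's show
print "$datestr\n";
```

Passing an explicit epoch value makes it easy to go back and capture an earlier day's show.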
Originally, I was going to use one of Perl's several XML-parsing modules to make sense of the XML in the stream link, but in the end all I needed was a regular expression to extract the mms: URLs.
my $xml = $resp->decoded_content;
my (@urls) = $xml =~ m/HREF="(mms:[^"]+)"/msg;
This gives me a list of URLs stored in @urls. Now I just need to feed them to mplayer:
my $i = 1;
foreach my $u (@urls) {
    my $seq = sprintf("%02d", $i);
    my @cmd = ( 'mplayer',
                '-vc', 'null',
                '-vo', 'null',
                '-ao', "pcm:fast:file=${datestr}-${seq}.wav",
                $u );
    system(@cmd);
    if ($? == -1) {
        print "failed to execute: $!\n";
    }
    elsif ($? & 127) {
        printf "child died with signal %d, %s coredump\n",
            ($? & 127), ($? & 128) ? 'with' : 'without';
    }
    else {
        printf "child exited with value %d\n", $? >> 8;
    }
    $i++;
}
This little ditty creates an output file for each of the segment streams. These are named something like 20080604-05.wav.
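The status checks in that loop follow the pattern documented for Perl's system function. Here is a sketch of the same decoding pulled into a reusable sub. The name describe_status is hypothetical, and passing the error text as a second argument is my own choice rather than anything from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: turn a wait status (the value of $? after system)
# into a readable message. $err is the text of $! for the -1 case.
sub describe_status {
    my ($status, $err) = @_;
    if ($status == -1) {
        return 'failed to execute: ' . (defined $err ? $err : 'unknown error');
    }
    if ($status & 127) {
        # Low 7 bits hold the signal number; bit 128 flags a core dump.
        return sprintf 'child died with signal %d, %s coredump',
            $status & 127, ($status & 128) ? 'with' : 'without';
    }
    # Otherwise the exit code lives in the high byte.
    return sprintf 'child exited with value %d', $status >> 8;
}

# Example: run a trivial child process with the current perl and report on it.
system($^X, '-e', 'exit 3');
print describe_status($?, "$!"), "\n";
```

Returning a string instead of printing makes it simple to log the result or to stop the loop when mplayer fails on a segment.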
When the loop is finished, I have several WAV files sitting on the disk. Now I need to somehow sew them all together into one big WAV file so I can convert it to an MP3 or Ogg-Vorbis file. For this, I turn to sox. I decided to have the Perl script generate a shell script to run all the sox and lame commands needed.
open my $fh, '>', "/tmp/${datestr}.sh"
    or die "cannot write /tmp/${datestr}.sh: $!";
foreach my $j (1..($i-1)) {
    my $seq = sprintf("%02d", $j);
    print $fh 'sox ', "${datestr}-${seq}.wav", " -t raw - | cat >> /tmp/${datestr}.raw", "\n";
}
print $fh 'sox -w -s -c 1 -r 22050 ', "/tmp/${datestr}.raw ${datestr}.wav\n";
print $fh "lame -mf -q2 ${datestr}.wav ${datestr}.mp3 ";
print $fh "--tt \"Glenn Beck Show - $datestr\" ";
print $fh "--ta \"Glenn Beck\" --add-id3v2\n";
close $fh;
Then, I run the shell script:
system('sh', "/tmp/${datestr}.sh");
Finally, I do a little cleanup:
unlink "/tmp/${datestr}.sh", "/tmp/${datestr}.raw", map({ sprintf("%s-%02d.wav", $datestr, $_) } (1..($i-1)));
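Since the segment files are named with a zero-padded sequence number, a small helper that builds the name in one place keeps the capture loop, the sox step, and the cleanup in agreement. A sketch, using the hypothetical names segment_file and segment_files:

```perl
use strict;
use warnings;

# Hypothetical helper: name of the WAV file for segment $n of a show,
# using the same zero-padded pattern as the capture loop (%02d).
sub segment_file {
    my ($datestr, $n) = @_;
    return sprintf '%s-%02d.wav', $datestr, $n;
}

# All segment file names for a show with $count segments.
sub segment_files {
    my ($datestr, $count) = @_;
    return map { segment_file($datestr, $_) } 1 .. $count;
}

print join("\n", segment_files('20080604', 3)), "\n";
```

With something like this in place, the cleanup could pass segment_files($datestr, $i - 1) straight to unlink.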
And, I'm done. There are many other ways I could have gone about doing this, but I found a way that worked and ran with it. I'd love to hear from people who have done something similar and how they did it.
I've been reading the book Liberal Fascism by Jonah Goldberg. The title of the book is guaranteed to set people off, one way or another, and for this reason Goldberg seems to spend an extraordinary amount of effort defending his premises and explaining that he's not saying that today's liberals are anti-Semitic, genocidal maniacs. What he does say, and says very well, is that history's most common tales of fascism, such as Adolf Hitler and Benito Mussolini, were largely influenced by progressive thought--the same progressive thought that rules the Democratic party and liberal politics today.
In a July 2007 debate, Hillary Clinton responded to the question of whether she would refer to herself as a "liberal."
"You know, ['liberal'] is a word that originally meant that you were for freedom, that you were for the freedom to achieve, that you were willing to stand against big power and on behalf of the individual.
"Unfortunately, in the last 30, 40 years, it has been turned up on its head and it's been made to seem as though it is a word that describes big government, totally contrary to what its meaning was in the 19th and early 20th century.
"I prefer the word 'progressive,' which has a real American meaning, going back to the progressive era at the beginning of the 20th century.
"I consider myself a modern progressive, someone who believes strongly in individual rights and freedoms, who believes that we are better as a society when we're working together and when we find ways to help those who may not have all the advantages in life get the tools they need to lead a more productive life for themselves and their family.
"So I consider myself a proud modern American progressive, and I think that's the kind of philosophy and practice that we need to bring back to American politics."
At the time of this debate, I was reading The Forgotten Man by Amity Shlaes, which provides a new look at the political forces at play before and during the 1930s when the United States was enduring the Great Depression. What Shlaes reveals--and what many people don't know--is that Franklin Roosevelt's New Deal policies were formed with the help of a team of progressive advisors and cabinet members who had varying degrees of infatuation and admiration for Joseph Stalin and Benito Mussolini and the forms of government they were managing and/or advocating.
Shlaes argues that the policies of the Roosevelt administration were a significant reason the Great Depression lasted for the entire decade of the 1930s while other industrialized nations around the world suffered an economic hit in 1929 and then recovered relatively quickly.
I mention this because, thanks in part to my friend Glenn Beck, I recently came across a number of platform statements and congressional records belonging to presidential candidate (and current frontrunner) Barack Obama that suggest he is ready to (blindly?) take us right into a repeat of the 1930s.
National work programs
The Roosevelt administration, in the interest of stimulating the economy and helping the large number of unemployed, created a number of government work plans including the Civilian Conservation Corps, a work program for young men, 17 years old or older. The CCC put these men to work in camps on various projects around the country such as clearing out dead wood in forests and building bridges, walkways, and roads, and other construction projects, usually in rural or undeveloped settings.
Last week, Barack Obama announced to Wisconsin auto industry workers that, as president, he would propose over $200 billion in programs to create new government jobs. The bulk of this spending would go to create a workforce of "green-collar workers" that would tackle environmental issues like finding new forms of enviro-friendly fuels. Other jobs would go to infrastructure projects such as highways and bridges.
While I agree that good hard work is good for the mind and soul and would benefit individuals who would otherwise be unemployed and potentially idle, I can't help but be concerned that Sen. Obama hasn't studied his history. Quite frankly, it doesn't seem like many on the left have studied their history because these types of programs are becoming quite a popular topic of discussion among liberals. If we know we're going into a period that may be like the 1930s, why would we do the same things that prolonged the suffering and the stagnation then?
The less-fortunate
Many Americans believe we have an obligation to help those who are less fortunate around the world. Liberals believe this should be a function of the federal government. Conservatives, on the other hand, would prefer this be done by private organizations and charities. One of the reasons conservatives feel this way is because the charitable feeling is completely lost when your money is forcefully taken from you by the federal government in the form of taxes and fees, no matter how good the intentions are. Plus, there is the issue of how efficiently those funds will be handled.
Senator Obama, along with fellow senators Chuck Hagel and Maria Cantwell, has sponsored legislation known as the "Global Poverty Act," which passed the Senate Foreign Relations Committee this last week. If passed, this legislation would require that the federal government provide a small percentage of GDP as financial aid for countries where people live in poverty. The US would not send this money directly to the people or their governments. Instead, we would give that money to the United Nations to administer the funds.
Again, when will people learn?! Our government created a formal "War On Poverty" after World War II and spent plenty of money on programs to help the poor improve their station in life. Did anyone actually rise out of poverty? Not according to statistics. Because of this, and because the government continued to raise the poverty level to include households that were less and less poor, the number of people who qualified for assistance under these programs grew.
In 1964, Ronald Reagan gave a speech titled "A Time For Choosing". In it, he addresses the inefficiency of the government's welfare programs.
"We are told that 9.3 million families in this country are poverty-stricken on the basis of earning less than $3,000 a year. Welfare spending is 10 times greater than in the dark depths of the Depression. We are spending $45 billion on welfare. Now do a little arithmetic, and you will find that if we divided the $45 billion up equally among those 9 million poor families, we would be able to give each family $4,600 a year, and this added to their present income should eliminate poverty! Direct aid to the poor, however, is running only about $600 per family. It would seem that someplace there must be some overhead."
He also talks about the overall ineffectiveness of cutting checks to those "in need":
"If government planning and welfare had the answer and they've had almost 30 years of it, shouldn't we expect government to almost read the score to us once in a while? Shouldn't they be telling us about the decline each year in the number of people needing help? ... But the reverse is true. Each year the need grows greater, the program grows greater."
Again, haven't we learned anything from our past mistakes? Why can't our political leaders learn what works and employ those techniques instead of playing the same old card again and again?
What works for poverty, unemployment, etc.? Not free handouts.
The LDS Church here in Utah has its own welfare programs which are available to anyone, regardless of church affiliation. These programs are not handouts. Instead, they are structured, compassionate programs that encourage the recipients to "give in" to receive. Meals, clothing, and other assistance are available to those in need and, in turn, the recipients are asked to give of their time and effort to help provide the same services to others. This is a perfect example of why private charitable organizations are much better equipped to deal with these kinds of problems than the bureaucratic nightmare of the federal government.
Obama's legislation states that it is all part of an international agreement to help combat poverty. This means that all participating countries will be taking a portion of their national revenue and giving it to the United Nations for distribution to poor areas. Two alarm bells go off when I ponder this: Global redistribution of wealth, a socialist policy tenet, and international taxation by the United Nations! When will the madness stop?
The United Nations is supposed to help keep the peace in sensitive areas of the world and it can't even do that well. Why would anyone think this organization would be effective and act responsibly in an effort to combat poverty? Oil for food, anyone? Do progressives, liberals, and socialists simply lack the ability to learn?!
Debt
The United States government, and by association, the citizens of the United States, are between $9 and $100 TRILLION in debt. I fail to see the sense of spending more than what is required to maintain bare essential services until this debt is eradicated. Social programs, earmarks, museums, assistance programs... They should all be stopped or shrunk so that some of the government's revenue can be applied toward the outstanding debt.
History tells us Thomas Jefferson had much to say about debt, both personal and national. He stated it was vital that the country not take on debt and if it did, that it should be no more debt than could be paid for in one generation.
"It is incumbent on every generation to pay its own debts as it goes. A principle which if acted on would save one-half the wars of the world."
As a country, we have ignored Jefferson's advice since the beginning of the 20th century and now we are witnessing the effects of years of irresponsible borrowing in our economic outlook.
And speaking of irresponsible borrowing, Barack Obama has proposed a $10 billion federal fund to help "innocent victims" caught in the subprime loan mess. Are there really innocent victims? I don't think so. When you borrow money to purchase a house, you have plenty of opportunity to learn what you're getting into, what your obligations are, etc. The lending institutions are certainly not innocent either because they have time-tested methods for determining risk when lending money. What Obama is suggesting is essentially saddling us with more national debt because of a few people's irresponsible behavior.
Fiscal discipline and revenue
Barack Obama's website says a lot about a need for fiscal discipline and responsibility. I'm glad his website says these things, but if he really believes in them, how are these billions upon billions in federal programs going to be funded? There will have to be greater revenue to the federal government and/or less spending on programs that are already there. Obama's honest about this, if not direct about it. If you peruse his website, you'll learn he wants to cut spending on various programs and he wants to repeal the Bush tax cuts. Well, only for the rich, not for the poor or middle class taxpayers.
While repealing tax cuts for the rich is a popular thing to do (because there are a lot more people who aren't rich than are), it is, by definition, not fair. I would really like someone to explain to me why it makes sense that we pay a different percentage of our assets in taxes based on the amount of assets we have. To be fair, equal, and all that, shouldn't we each pay the same percentage?
What are the economic repercussions of saddling the "rich" with more taxes? The rich are more likely to spend more than those who are less wealthy, so this would cut into their spending power. The rich are more likely to employ others than those who are less wealthy, so this cuts into their hiring power. Hello?! Tax hikes on the rich are a direct attack on important driving forces of the economy: consumer spending and employment!
Our good friend Jesse recently posted an article on his blog saying, "NOW is the time [for Perl] to step up!!!" Jesse mentioned a discussion going on inside the Ruby on Rails community in which at least one significant member expressed frustration with the lack of intelligent software architecture methodology put into Rails. I think he called other RoR contributors "a bunch of half-trained PHP morons," which goes a long way -- in my book, anyway -- toward describing something akin to building an automobile out of toothpicks and rubber cement.
Jesse suggests this is the perfect time for Catalyst to make an entrance. I couldn't agree more.
Catalyst is a Perl web application development framework that compares, in some ways, to Ruby on Rails. Catalyst does a fine job of providing developers with a solid MVC framework for developing web applications, but I think what makes Catalyst so formidable is that it also leverages much of the excellent Perl code available from CPAN, the global distributed repository of reusable Perl component code.
Yeah, Catalyst is awesome. I'm thrilled to be using it for a project at work right now. That may come as a surprise to people who have heard me describe KnowledgeBlue as a purely Java shop, but the company is adapting to better exploit the skills available just as I am taking steps to learn more about Java development.
Catalyst needs a lot more (well-written) documentation. There is an excellent tutorial -- Catalyst::Manual::Tutorial -- that is distributed in POD format as part of the Catalyst::Manual package, but even after going through this tutorial, a Catalyst newbie is likely to still be doing some head scratching.
The tutorial is great, really. It walks through setting up a connection to a database backend with DBIx::Class, creating templates using the all-powerful Perl Template Toolkit, and using the Catalyst tools to magically provide authentication and program flow.
The problem, however, is that Catalyst can do so much that it can be difficult to grasp how to do simple things. The complexity of deploying a simple application approaches what a JSP developer must do. The difference is that after a JSP developer edits several configuration, source, and HTML files, he or she has a simple web application that says "Hello World." The Catalyst developer might spend the same amount of time and end up with a "Hello World" application in a very extensible MVC framework. From there, it requires a minimal amount of work to extend the application, for example, to send its output in PDF format or to get its "Hello World" message from a web service.
On top of wrapping your mind around all that Catalyst can do and how to do it, there is the large number of Perl packages you must install. This is less of a burden than it used to be because a lot of Linux distributions provide the fundamental packages necessary for Catalyst development like Catalyst::Runtime, Catalyst::Devel, and Catalyst::Manual, but to really develop kick-ass applications, you've still got to install other packages like HTML::FormFu, Template::Alloy, DBIx::Class, and others in addition to their corresponding Catalyst glue modules.
I think this blog posting may be the first in a long series of brain dumps on Catalyst. I hope I can make it easier for others to transition into Catalyst development.
One thing I'm going to try to do before going back to work on 2 January is get a Perl application released to the general public that Iodynamics was working on partly for a client and partly as a pet project. It's called FileStore and it's a mod_perl application for giving users Web-friendly access to files stored on a server.
The original inspiration for FileStore was Apache::FileManager which is still available via CPAN. Apache::FileManager was great... until Apache version 2 came out. I spent some time in the Apache::FileManager code to see how hard it would be to get it working under mod_perl version 2 and ended up wanting to cause bodily harm to Phillip Collins, its author.
I originally wrote Iodynamics::FileStore for internal use within our company several years ago. A handful of clients saw the benefit of using it to grant employees Web-access to Samba servers, but otherwise it was relegated to use only within our network. While it was a mod_perl handler, it relied heavily on Lincoln Stein's CGI.pm module for, well, everything. All the HTML was generated on the fly from CGI.pm calls. While that made for a compact self-contained application, it made it difficult to maintain and customize.
Another big weakness with Iodynamics::FileStore was that there was no built-in user authentication. It was typically deployed in a location configured for basic authentication with Apache. Once you made it past the basic username and password prompt, you had as many privileges as anyone else, usually "777" permissions to the filesystem being served by the application.
So, in early 2007, I started redesigning the FileStore application thinking we could turn it into some sort of Web service people might actually pay for: An online, Web-based file server accessible from anywhere. Something like Alfresco, I guess, but without all the crap that gets in the way of being productive. (Can you tell I'm not too fond of Alfresco?)
Stephen Weeks and I took the original FileStore concept and added rudimentary role-based user authentication/authorization functions. We also created Template Toolkit templates to supplant all the CGI.pm presentation code. I worked with David Baker to design some simple icons for things like "Upload," "Copy," "Rename," and "Delete."
Then, Iodynamics went the way of the cuckoo.
A month or so ago, David Baker asked me about the FileStore project, wondering if he could use it for a personal project. I dusted off the code and installed it on his server and started thinking about releasing it for public consumption.
I'm ready to start doing just that. Eventually, I'd like to get Apache::FileStore in CPAN, but a lot of administrative work will need to be done on the project before that can happen. A minimal test suite will need to be written as well as documentation in POD format.
In the meantime, here's a screenshot. I'll make a tarball available soon.

Great news for Catalyst developers using Fedora 7 or Fedora Core 6 Linux distributions: Core Catalyst modules are NOW available in FC6 extras and F7 repositories!
Yes, it's true! Just do a yum install perl-Catalyst-Devel and BAM! You'll be taking web development to a new level while staying in the comfortable world of manageable packages.
(Note: Thanks to redbeard2 for pointing out my stupid typo: "are not available.")
Many years ago, we at Iodynamics decided we needed a wiki, so I found some simple Perl-based wiki engine. It was painfully simple. Absolutely no bells or whistles.
A couple years ago, I became familiar with Kwiki and found it to be a huge improvement over the plain-vanilla wiki software we had been using. The nice thing about Kwiki is the large number of plugins available. You can integrate Template Toolkit templates, have per-user preferences, WYSIWYG page editing, and include RSS feeds through the use of Kwiki plugins, which are really just additional modules available through CPAN.
Kwiki still has its drawbacks. It is difficult--or at least not very straightforward--to protect your Kwiki wiki from spammers. Many of the plugins suffer from stagnation and abandonment. The Kwiki community seems to have maybe... moved on?
Having been impressed with Kwiki's extensibility, I recommended it to some people for some sites. My buddy Dave had me set it up for a couple sites, including barbershopwiki.org, a site for barbershop-style singers and enthusiasts to share information.
The barbershopwiki.org site seemed to instantly become a hit within that small community of users, but their enthusiasm turned to frustration after a period of time when pages on the site were defaced with advertisements and links to porn sites.
Of course, with Kwiki's easy revision control features, it was easy to replace the defaced pages with previous, legitimate content, but it quickly became a major hassle to have to do this all the time.
The solution is to only allow write-access to pages to users you trust and not to anonymous users. This flies in the face of the wiki design principles upon which the wiki concept was established. That is, wikis are supposed to be open for anyone to edit, ensuring the free flow of information and the freedom for anyone to add to or correct information they find in a wiki.
Recently, Kwiki has been seeing some new development. A 2.0 release that addresses many of the issues I've mentioned is planned. However, from what I can tell, the development isn't very active, so it's anyone's guess as to when Kwiki 2.0 is going to be available or, more importantly, when it will be production-ready.
Many people have suggested MediaWiki to me. I applaud the MediaWiki folks for creating such a popular piece of software and, for the most part, I love what Wikipedia has become, but I can't help being biased against MediaWiki for the mere fact it's developed using PHP.
Not only that, but we have done some customization of MediaWiki for a client and found the underlying PHP code quite a bit more yucky than most PHP code. This also does not shed a positive light, in my mind, on MediaWiki.
So, I went off to search for other Wiki engines. It wasn't long before I landed at the home page for TWiki. The majority of people reading this are not familiar with or have tried very hard to forget that "Twiki" was the name of a short anthropomorphic robot sidekick on the 1970s television series Buck Rogers in the 25th Century. Twiki was generally useless, walked around saying "beady beady beady," and annoyed all viewers over the age of eight.
The information I just shared with you is not at all relevant to the topic I'm trying to address.
Anyway, it seems that TWiki has taken an approach similar to Kwiki by providing extensibility options through CPAN modules, but they've built a baseline wiki platform that, quite simply, RAWKS!
After playing with it off and on for a couple of days, I decided the time was right and immediately replaced Kwiki with TWiki as the wiki engine for the barbershopwiki.org site. So far, it's been great.
Because the default installation of TWiki is more complex than, say, Kwiki, it's important that you carefully read the INSTALL.html file that accompanies the installation tarball. This file will walk you through setting file ownership and permissions on files and directories, configuring Apache to securely host your TWiki site, and setting up some initial user accounts with administrative privileges.
Installing Catalyst on a Fedora/RedHat system can be a challenge - especially if you insist on keeping all installed software manageable via RPM. This is because many of the packages required to run Catalyst are not available from any common repositories and therefore must be built as RPM packages.
Eventually, I believe these packages will either show up in fedora-extras or a new third-party repository will be created for Perl modules required by Catalyst that are provided by neither fedora-core nor fedora-extras.
In the meantime, I've created a straightforward shell script called catalystinstaller.sh that drives yum and cpan2rpm to install everything you need to get a good start developing Catalyst apps on a Fedora Core 6 system.
I'm aware of one problem with this script so far. When building Catalyst::Runtime, the module's configuration routines complain that the optional package Catalyst::Engine::Apache is not installed. It appears to want version 1.05, which is older than the currently available version, and that newer version is already installed by this point in the process. So, just hit ENTER when it asks if you want to install it (the default is No).
This is a first release (v0.01). Run at your own risk. Please report any issues you run across and, of course, any suggestions you come up with.
I would love to set up a yum repository for these packages that are not available yet in the standard repositories, so, if someone wants to help with that project, please let me know. The hard part won't be setting up the repository, but keeping packages up to date.
Interested parties can download the catalystinstaller.sh script here.
I thought I'd share a short demonstration of some very cool Perl technology.
About a week ago, I got back from taking my family on a short vacation to beautiful southern Utah. I documented our trip and posted copies of a few photos we took in the Events section of my website. You can read the Sept. Mini-Vacation page yourself.
In the past, when preparing images for publication on the web, I've either hand-coded the HTML -- usually a mess of table elements -- or I've used a homegrown shell script called buildgallery.sh that generates the HTML table code around a set of images. After generating the scaffolding of table code for the image organization, I'd go in and populate various cells with image captions, descriptions, and links to different sizes.
This works, but tables are ugly. CSS-only presentation is so much more elegant. Plus, it'd be better if I didn't have to jump around the HTML to populate the caption and description text for all the images.
Perl to the rescue!
I'm already using the Perl Template Toolkit to generate static pages for each page on my website. I use the ttree command at the command line on my staging server to generate the pages by "wrapping" the main content.
Since the page about our southern Utah vacation was just being fed to a template processing engine to generate a resulting HTML file, it made sense to use the power of the Perl Template Toolkit to generate the HTML around my gallery of images.
By the time I've gotten to this point, I've already run a shell script that uses ImageMagick's, uhm, magic, to create sensibly-sized thumbnail and web-friendly sized versions of each image. Those images are stored in files with .med and .thumb in the names before the filename extension.
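The naming scheme described above can be sketched in a few lines of Perl. This is only an illustration, not the shell script itself: the helper names derived_name and convert_commands are hypothetical, and the 640x480 size for the .med version is an assumption (156x117 simply matches the thumbnail dimensions used in the example data in this post). The sketch builds the ImageMagick convert command lines but does not run them.

```perl
use strict;
use warnings;

# Hypothetical helper: put a size tag before the filename extension,
# e.g. "photo.jpg" + "thumb" gives "photo.thumb.jpg".
sub derived_name {
    my ($file, $tag) = @_;
    my $out = $file;
    # If there is no extension to split on, just append the tag.
    $out =~ s/(\.[^.\/]+)\z/.$tag$1/ or $out .= ".$tag";
    return $out;
}

# Build (but do not run) one ImageMagick "convert" command per size.
# The sizes passed in are assumptions for illustration only.
sub convert_commands {
    my ($file, %sizes) = @_;
    return map {
        [ 'convert', $file, '-resize', $sizes{$_}, derived_name($file, $_) ]
    } sort keys %sizes;
}

my %sizes = (thumb => '156x117', med => '640x480');
for my $cmd (convert_commands('fiesta_fun-all_3-2.jpg', %sizes)) {
    print join(' ', @$cmd), "\n";
    # system(@$cmd) == 0 or die "convert failed: $?";  # uncomment to run
}
```

Keeping each command as a list rather than a single string means it could be handed straight to system() without any shell quoting worries.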
For each group of images, I declared a data structure called images which is essentially an anonymous list of anonymous hashes. Each anonymous hash contains information about an image such as the base filename, image dimensions, title, and description text.
[% images = [
    { basename    => "fiesta_fun-all_3-2",
      width       => 156,
      height      => 117,
      title       => "Our kids driving the kiddie-carts",
      description => "Maya, Lucy, and Eli on the kiddie-carts course." },
    { basename    => "fiesta_fun-lucy_eli-4",
      width       => 156,
      height      => 117,
      title       => "Lucy and Eli",
      description => "Lucy and Eli on the kiddie-carts course." }
] %]
For those unfamiliar with the Perl Template Toolkit syntax, it is similar (but not too similar ;-)) to PHP or ASP in that template code can be embedded with HTML as long as each piece of template code begins with [% and ends with %]. This is, of course, overridable -- you can change those start/end sequences to anything you want.
Next, I declared a block called doimage which takes one set of information about an image -- one hash from our list of hashes -- and generates HTML for it.
This BLOCK section only needs to be declared once in the file and could be placed in its own file for re-use later, if I chose to do so.
[% BLOCK doimage %]
[% DEFAULT img_path = "images" %]
<div class="picture">
  <div class="imagecontainer">
    <a href="[% img_path %]/[% img.basename %].med.jpg" target="_new">
      <img width="[% img.width %]" height="[% img.height %]"
           src="[% img_path %]/[% img.basename %].thumb.jpg"
           alt="[% img.title %]"
           title="[% img.title %]"/></a>
  </div>
  <div class="textcontainer">
    <div class="picture_caption">[% img.title %]</div>
    [% IF img.description %]
    <div class="picture_description">[% img.description %]</div>
    [% END %]
  </div>
  <div class="floatbreak"></div>
</div>
[% END %]
It should be pretty obvious by looking at the code (unless you're reading from an RSS feed of this document, in which case, the above example will be all messed up) how the values of the individual hash elements are interpolated into the HTML.
Next, I need to loop through all the images by iterating through the list of hashes I declared earlier. For each one of these hashes, I want to process the information in the hash using the block I declared.
[% FOREACH img = images %]
    [% PROCESS doimage img_path = "images" %]
[% END %]
Finally, I dressed up the HTML by adding some properties to the stylesheet:
.picture {
    margin: 5px;
    margin-left: 20px;
    border: solid #aaa 1px;
}
.imagecontainer {
    text-align: center;
    float: left;
    width: 170px;
}
.picture img {
    margin: 5px;
}
.textcontainer {
    margin-left: 200px;
}
.textcontainer .picture_caption {
    font-family: sans-serif;
    font-size: large;
    font-weight: bold;
}
.textcontainer .picture_description {
    font-family: sans-serif;
    font-size: medium;
    font-weight: normal;
}
.floatbreak {
    clear: both;
    height: 0;
}
(A big shout-out of kudos to Tene for helping me with that CSS.)
In summary, I don't think I'll be using any shell scripts to build HTML table scaffolding around my simple image galleries anymore.
I've already thought of some ways to improve upon this. For example, I could use the Perl Template Toolkit's Image plugin to examine each image file and automagically extract information such as image dimensions, EXIF data, etc. as it iterates over the list of image information.
I've been so busy lately, but tonight I took some time to get caught up on the Utah Open Source Planet and, I must say, there was lots of good stuff to read. Thanks to y'all for sharing your knowledge. You rock.
I thought I'd pick on one of my favorite UOSSP bloggers, Aaron Toponce, but not in a negative way. I read his semi-recent entry about using HTML entities to obfuscate web site data in an attempt to foil robots -- particularly robots intent on harvesting e-mail addresses and other information.
Some years ago, I implemented this technique on several sites, personal and professional. It seemed to make sense that the average spammer/data-harvester was not going to implement the code necessary to de-entity-ize the content in search of e-mail addresses. In retrospect, however, I think that's a poor assumption.
See, spammers have money and they give their money to poor souls who will write code for money and, in many cases, have the smarts to pull it off. So, semi-smart coders tasked with maximizing the pool of e-mail addresses gleaned from a vast array of websites will very quickly implement techniques to foil the simplest of data obfuscation techniques. Converting text to HTML entities has got to be one of the first obfuscation techniques they are faced with circumventing.
After that, they probably implement simple OCR techniques to glean data from sites that convert all their e-mail addresses into text rendered as image files.
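To put the entity-decoding step in perspective, here is a minimal sketch of how little code it takes to undo numeric-entity obfuscation. It uses only core Perl; the function names obfuscate and deobfuscate and the placeholder address are made up for illustration, and a real harvester would presumably also handle named entities.

```perl
use strict;
use warnings;

# Encode every character as a decimal numeric entity (the obfuscation step).
sub obfuscate {
    my ($text) = @_;
    $text =~ s/(.)/'&#' . ord($1) . ';'/gse;
    return $text;
}

# Undo it: turn decimal and hexadecimal numeric entities back into characters.
sub deobfuscate {
    my ($html) = @_;
    $html =~ s/&#x([0-9a-fA-F]+);/chr(hex($1))/ge;
    $html =~ s/&#(\d+);/chr($1)/ge;
    return $html;
}

my $address = 'someone@example.com';    # placeholder address
my $hidden  = obfuscate($address);
print "$hidden\n";
print deobfuscate($hidden), "\n";
```

Two substitutions are enough to recover the original text, which is why this kind of obfuscation only slows down the laziest bots.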
That said, this HTML entity-based obfuscation technique is better than nothing, right? Because spammers like their pools of e-mail addresses to be fresh, it usually only takes a couple of weeks to see if any anti-spam technique results in a significant reduction of incoming spam, so it's easy to verify your technique is working. When we implemented the HTML-entity based obfuscation technique, there was a decrease in the amount of spam, but there was still plenty of spam.
If you're interested in playing with ways of automating the process of converting text data to a string of HTML entities, check out the HTML::Entities Perl module -- part of the comprehensive HTML::Parser distribution of modules.
Once you have this installed, you can do something like this:
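perl -MHTML::Entities -ne 'print encode_entities($_, "\040-\377")'
For the Perl head-scratchers, this is a one-liner that loads the HTML::Entities module, wraps a loop around reading from STDIN or the files named on the command line, and prints the result of the encode_entities() function call for each line of input read. The second argument is the set of characters to encode; the escapes are octal, so "\040-\377" covers everything from the space character (32) through character 255. Hit Control+D to get out of it.
[foo] /home/fozz 19 % perl -MHTML::Entities \
-ne 'print encode_entities($_, "\040-\377")'
Aaron Toponce
&#65;&#97;&#114;&#111;&#110;&#32;&#84;&#111;&#112;&#111;&#110;&#99;&#101;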
When it was clear the HTML entity-based obfuscation simply did not have what it takes to win against increasingly smart harvesting bots, we deployed a CAPTCHA solution using the Authen::Captcha Perl module for our clients that really needed/wanted to publish e-mail addresses on their websites. This solution has worked out much better and, paired with educating users about the risks of leaving your e-mail address on websites, we've seen more significant decreases of incoming spam.
I've been working on the next generation of the Fozzolog weblog framework for a while now. This new generation is likely to be worth showing the rest of the world (*crosses fingers*).
The new framework makes use of mod_perl and many of my latest favorite Perl modules including, but probably not limited to:
One issue I ran into in development was how long it took to render a listing of all weblog entries, especially if there are many entries. My thoughts first went to the database, so I tried a couple things to optimize data retrieval. First, I dropped the body text from the entries in the SQL query so the results were smaller. That seemed to make things a tiny bit faster, but not much. Next, I used Cache::Cache to cache the results because a listing of all weblog entries doesn't change very often. That made a difference, but it still took six seconds to list all my entries.
