BurlyHost.com, Inc. Web Hosting Blog » Perl


Posts Tagged ‘Perl’

Simple ways of keeping your web site and hosting account safe.

Friday, October 10th, 2008 by Justin M.

Keeping a site safe from malicious users (often referred to as “hackers”) depends on a great number of variables. All of our staff have faced these questions and concerns from clients at one time or another, or have responded to questions posed on a newsgroup or forum community. Here, we’d like to outline some of the things users can do to help protect themselves from such problems.

Firstly, it is a misconception that “nothing is 100% secure”. While it is true that very few of the services or scripts people commonly use have a history free of known security issues, some things have a proven track record of being secure, such as Dan Bernstein’s qmail MTA mail software and his djbdns (tinyDNS) DNS services software. While we don’t claim there is no way to exploit his software, no one to date has been able to, despite years of public challenges and a prize of money and credit for anyone who can.

Similarly, many of us are programmers ourselves and have written thousands of programs, with thousands of lines of code each, for popular web sites that people are very intent on breaking into, and we also have a track record of no exploits against the scripts and programs we developed. This is not to claim perfection, or to say that an exploit isn’t possible, especially in light of the APIs and wrappers the scripts run through, the environment they run in, and the interpreter engines they use for languages such as Perl, PHP, Ruby and Python. It is simply to point out that security is not something to view fatalistically, as hopeless, or as a battle that is inevitably lost.

I know that I, and my coworkers, would be quite comfortable putting up a challenge for the programs we have written, or standing by them with the claim that they are, in fact, 100% secure. Coding securely is not a difficult task if you are an experienced and knowledgeable programmer. Anticipating potential problems, and taking measures against the ones you didn’t anticipate, plays a key role: for example, deny all functionality and input by default, then build the functions to accept specific input, and to do so in a controlled manner. Selecting the proper environment, how the program will run, the privileges it has, and above all, sanity and security checks, will go a long way. Of course, many scripts out there weren’t coded with these things in mind and many do suffer from exploits, which we’ll attempt to cover a little further down in this post.

The same thing can be said of account security itself. While there may be a lot to it, with thousands of specific incidents and potential security holes in the software floating around out there, a few general practices can help prevent a lot of these without specific, targeted solutions for every type of software and its known or potential exploits. A few simple things go a long way as a good basis, and some suggestions along the way can help as well. Should you truly follow such advice, then combined with even only a reasonably secure server, you should not experience the kinds of site compromise that account for a very high percentage of incidents (literally, 99% of the time).

If you use our services, then you already can rest assured that we take all reasonable and custom measures available to prevent a high level security issue, but ultimately your own account and script security will dictate how secure your site is — no matter how well we secure our services, if you host a script that acts as a gateway into your account, there’s only so much we can do to protect your site. We do offer some services that are unique to BurlyHost.com, Inc., which will help you to protect your site, even if you happen to be lazy about your site’s security, but, as our CEO says, “if you create a security hole in your site by how you use and store your own passwords, or what scripts you choose to run that might be insecure, then all bets are off”. While you shouldn’t worry, you should take the matter seriously. The below suggestions will help, and should you follow them, it would be unlikely that your site would ever be compromised.

#1: Keep up to date on all scripts. First, consider whether you want to run certain types of scripts at all. Some are practically asking for trouble, though most common ones are safe in general. Subscribe to common security alert announcements and mailing lists, and especially keep up to date on any security issues posted on the site of the author of each script you use. Authors will often be the first to release security patches you can apply to your current installs, or to offer a new version that you can either install in place of your current one or, more likely, upgrade to. Of course, you will want to make a safe backup of the scripts, data and databases you are going to patch, replace or upgrade, but be sure that backup isn’t publicly accessible, or else you might forget about it and end up with a newer, secure version running alongside an older version with a known security issue. The unfortunate drawback is that this can be time consuming, especially if you use a large number of scripts, and even more so if they come from a lot of different authors, for a lot of different interfaces. However, it is an absolute necessity. A major reason sites are compromised is that a lot of people install a script and leave it for years without ever updating it. Remember, just because it still works doesn’t mean a bug or security issue hasn’t since been found in it. Don’t be lazy about this. If you don’t have time, hire someone who does, or partner with someone who can help keep watch for updates and security notices related to your software and scripts.

#2: Don’t leave install/test scripts. Don’t start something and not finish it; this can spell disaster. Don’t ever install a script as a test and leave it, or any vital portion or extension of it, enabled. Many people will try out software, a script, or some new module or extension for a script they actively use, and then forget about it. They never revisit the install or the extension, and it is left as a ticking time bomb. Should someone guess its URL location and access it, they could use it against you and your site. In addition, until setup is complete, new installs will often allow control over certain aspects of your site. So if someone accessed it before setup was complete, or if you kept the default login and password, or set up an easy one because you weren’t sure you wanted to use the script, an attacker would have access to run the install/setup script, which can allow a lot of actions that most scripts won’t normally allow due to the security ramifications.

#3: Passwords. This is a major factor as well. Always use secure, strong and unique passwords. Never use dictionary words. Never use the same password for your scripts, forum logins (including admin logins), databases or anything else, that you use for something else. In other words, if you use a password for your main email, FTP, control panel or other login, do not use this same password when setting up a database or when setting the admin login for a script, even just for the install steps. Never use the same database passwords for different databases. Never use the same password for a forum account that you use somewhere else. Clearly, this can start to become a hassle if you have a lot of scripts, databases, logins, etc., but once the database passwords are set, you shouldn’t have to remember them. They are going to be set in your script’s configuration files, and you can write it down elsewhere.

People far too often prefer to simply ignore the problem and instead go for the ease of using the same password across all of the dozens of web sites they visit online. If someone were to guess or obtain your password from one of those services, online or otherwise, imagine how much data they could have access to. Some people use the same password for a web forum somewhere as they do for their hosting account, their online bank account, their PayPal account, their eBay account or even their own forum admin login. Someone could log into your forum admin area, abuse that script to read other passwords you have, and then try those at the other sites you might visit. Always choose to go through the hassle instead of making it easy for a malicious user. Remember, if you’re using the same password to make it easy for you, all you’re doing is ensuring that you’re an easy target.

Also consider not just where you use your passwords, but how you store them. The best method is to keep them in a local text file or database on your own system that is unlocked by a single master password. If you have too many passwords to remember, consider keeping them in a file, labeling each one with an abbreviation of the site or service that only you would recognize rather than naming it outright, and then encrypting that file with something like PGP. Make sure the single master password is strong and known only to you, even if that makes it difficult to remember. We also recommend against ever using a birth date, zip code, pet’s name, child’s name or birthday, favorite band, video game character, nickname, etc., even for a password reminder. That sort of information is far too easy to locate online, especially given how willing people are to reveal details about their lives in forums, blogs, bio pages and so on.

#4: Script and program configurations. Pay attention to what information you supply when configuring a new program or script. Some settings are more secure when left disabled, and some will help the security of the install, even if only against fraudulent signups, orders, memberships or posts on your site. Enabling logging of login attempts and failures, if available in the script, is helpful, as are captchas (image code verifications) and so on. If given the option, try not to install the script at a common location, especially if it’s a popular script, and even more so for the script’s admin directory. Should any exploits be known and you’ve not been keeping up on updates (let’s face it, most people just don’t do it), an automated exploit “bot” can hit your site at the default location and compromise the exploitable feature in the script automatically. Another good tip is to add password protection to any sub directory (such as an admin directory) that not everyone in the world has a reason to access. Another option is to use a deny/allow rule that denies all IPs access to the admin directory except your own IP, or the IPs (or IP ranges) of the staff you want to allow access.
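The allow list idea above is normally configured in the web server itself. As a rough sketch of the same deny-by-default logic inside a Perl CGI script (a different technique from a server rule, shown only for illustration), the following checks the client address against a short list. The addresses and the ip_is_allowed name are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical allow list: replace with the addresses you and your staff use.
# An entry ending in a dot is treated as a prefix covering whole octets.
my @ALLOWED = ('203.0.113.', '198.51.100.17');

# Returns 1 when $ip is a dotted-quad address that equals an entry exactly
# or starts with a prefix entry; returns 0 for everything else.
sub ip_is_allowed {
    my ($ip) = @_;
    return 0 unless defined $ip && $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
    for my $entry (@ALLOWED) {
        if ($entry =~ /\.\z/) {
            return 1 if index($ip, $entry) == 0;
        }
        else {
            return 1 if $ip eq $entry;
        }
    }
    return 0;
}

# In a CGI script the web server supplies the client address in REMOTE_ADDR.
my $client = defined $ENV{REMOTE_ADDR} ? $ENV{REMOTE_ADDR} : '';
print ip_is_allowed($client) ? "allowed\n" : "denied\n";
```

A check like this only protects the script that performs it, so a server-level rule or password protection on the directory remains the broader safeguard.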

#5: Permissions and Ownership: One important thing is the permissions of your files and directories. Our servers are set up so you can execute scripts through the web server over http (and secure https) with lower permissions than many scripts’ install instructions tell you to use. Instead of using a world writable permission (777, 666, etc.), you can set scripts to run with chmod 700, so only your user has read, write and execute permission. “Group” and “other” don’t even need read permission, so only your user can view and run the script. Since your user is the user that actually executes CGI and PHP scripts on our services, you can ignore any instructions to set world read/write or execute on your script installs. In fact, world writable scripts will error and not run, for your protection. Any files that are intended to be world readable and are not executable (not scripts) should be chmod 644 (giving your user read and write permission, and the web server read only). Directories should be chmod 755 to allow a directory listing to show web users the files located within them, or chmod 711 to allow files within them to be accessed without letting people see a list of all the files. Chmod 700 can be used for directories that hold data only intended to be read and written by a PHP or CGI script and not to be directly publicly accessible, though you can also consider putting such files outside of your web root directory path. These lower permissions prevent web users, as well as any other users on a shared server, from viewing, running or modifying your files, and from seeing your scripts’ source code or passwords on the server.
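The modes mentioned above can be set and checked from Perl as well as from a shell or FTP client. The following is a minimal sketch that works in a throwaway temporary directory; the file names and the helper names mode_of and is_world_writable are invented for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Returns the permission bits of a path as an octal string such as "0644".
sub mode_of {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    return sprintf '%04o', $st[2] & 07777;
}

# True when "other" has write permission (the 002 bit), as with 777 or 666.
sub is_world_writable {
    my ($path) = @_;
    my @st = stat($path) or die "Cannot stat $path: $!";
    return ($st[2] & 0002) ? 1 : 0;
}

# Work in a temporary directory so nothing real is touched.
my $dir    = tempdir(CLEANUP => 1);
my $script = "$dir/example.cgi";    # hypothetical script
my $page   = "$dir/index.html";     # hypothetical static page
for my $file ($script, $page) {
    open(my $fh, '>', $file) or die "Cannot create $file: $!";
    close($fh) or die "Cannot close $file: $!";
}

# Note the leading zero: chmod expects an octal number, not the decimal 700.
chmod(0700, $script) == 1 or die "chmod failed on $script: $!";
chmod(0644, $page)   == 1 or die "chmod failed on $page: $!";
chmod(0711, $dir)    == 1 or die "chmod failed on $dir: $!";

printf "%s %s\n", mode_of($_), $_ for ($script, $page, $dir);
```

A check like is_world_writable could be run over an account’s files to spot anything an install guide talked you into opening up too far.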

#6: Secure connections (SSL/TLS, etc.) Another important factor is to ensure that, when possible, you use SSL for connections that send sensitive data. Admittedly, the threat that SSL, TLS, tunneling, sftp, ssh, etc. protect against is not an everyday one, as someone would need super user access on the server, or on your network (or the remote network), to listen to the data being passed over the network and “sniff” out passwords or other data. But if you work in an office with a LAN/WAN, or connect out to the Internet over a connection where someone could have a listening device, or if someone did take control of the remote network, then with an encrypted connection they would not likely be able to obtain your data, such as the password you use. Unfortunately, if someone has root access on the remote server, data sent over SSL still arrives there as plain text, so someone with that access could modify the script or service on the remote end to capture it. That only applies on the server side, though, not on the network side, and all bets are off at that point anyway. Encryption still helps in such situations, because it limits the points of failure and the data that can be obtained across all services if one of them is compromised. This is mainly just good security practice: “just in case”, you want to take every measure to prevent a third party from obtaining the sensitive data you send and receive, so it’s a good personal policy. We don’t intend to get into the gory details of where it really does any good and in what situation, but there’s never any reason not to practice this, so always use a secure connection when it’s available, be it for web, ftp, ssh, imap, pop3, smtp, or anything else.

#7: Site structure/design and miscellaneous: Never rely on a browser-side check for security, or for anything that could pose a security issue on your site. Always do all checks server-side (using a PHP or CGI script, rewrite rules, .htaccess env var checks, allow/deny rules, SSI tags calling a script or environment variable, etc.). Be sure that any data you allow users to add or submit to your site or database is properly checked. Do simple checking to ensure there can’t be any injections into the database, and that no data added to the site or to a file can be executed, such as an SSI tag, some type of PHP scripting, or any JavaScript that could cause other viewers, or you as the admin, to have information stolen or a browser crashed (things such as stealing cookies or guessing sessions via an exploitable weakness or some type of cross site scripting attack). Also keep in mind the files you have. Loose files, old files and their contents can sometimes be a problem and provide information malicious users can use against you, especially executable scripts, database dumps stored in the web root, and files renamed as copies or backups, such as mysqdata.sql or myscript.php.back (the latter will no longer be executable, so people can view the source as a text file, for example). Also watch upload scripts, where people may bypass the file extensions you try to limit uploads to, or upload a non-image file (a script) and then make it run by uploading an .htaccess file that makes all files executable, so that image.gif is now a script or compiled binary they can run.
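As one illustration of a server-side check for the upload problem described above, here is a minimal sketch that accepts a submitted file name only if it matches a strict pattern. The list of extensions and the safe_upload_name helper are assumptions for this example, and a name check alone does not prove the content is really an image.

```perl
use strict;
use warnings;

# Extensions this hypothetical upload form is willing to store.
my %ALLOWED_EXT = map { $_ => 1 } qw(gif jpg jpeg png);

# Returns a cleaned file name, or nothing (undef) if the name is rejected.
# Deny by default: only a plain base name with one allowed extension passes.
sub safe_upload_name {
    my ($name) = @_;
    return unless defined $name;
    # Letters, digits, underscore and hyphen, then one dot and an extension.
    # This rejects ".htaccess", "shell.php.gif", "../x.gif" and empty names.
    return unless $name =~ /\A([A-Za-z0-9_-]{1,64})\.([A-Za-z0-9]{1,5})\z/;
    my ($base, $ext) = ($1, lc $2);
    return unless $ALLOWED_EXT{$ext};
    return "$base.$ext";
}

for my $candidate ('photo.gif', '.htaccess', 'shell.php.gif',
                   '../../etc/passwd', 'image.PNG') {
    my $safe = safe_upload_name($candidate);
    printf "%-18s => %s\n", $candidate, defined $safe ? $safe : 'REJECTED';
}
```

Storing uploads outside the web root, or in a directory where scripts are not executed, would complement a check like this.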

Another thing to consider is your robots.txt file and the directories you disallow search engines from accessing. Anyone can view that file (you could deny it to anything but valid search spiders, but someone can fake the user agent), and that person could then see all of the directories or files you are hiding from search engine indexing. That is not an appropriate method to hide anything. In fact, if there are no links to that area that can be indexed (followed) or seen, you shouldn’t have to worry about a search engine indexing it anyway. It doesn’t know what files you have on your site unless there’s a link in a public area or page for it to follow, like any browser would, and legitimate spiders don’t guess at file names. Also, if you suspect anything, watch the error and access logs for your site (but don’t get too worried when you see the very common requests from exploit and spam bots that randomly sweep across the Internet, trying generic and usually very old exploits against every IP and domain).

Conclusion and further options: These are just a few very simple and common things, which, if followed, should greatly reduce the chances of an account/script being exploited or defaced. It ultimately comes down to how secure the script is that you decide to run. For further protection of your site and reducing the chances of an exploitable script you might run on your account from causing damage to your files, please feel free to contact us, as we offer a very unique file locking service GUI for our users, which allows you great control over the files on your site. Stay tuned for an announcement about this service in the next few days.

Web Hosts that don’t want their servers and clients to be secure?

Saturday, October 4th, 2008 by Tim Greer

Yes, it’s true. Not only are some hosts ignorant and unskilled in this area, but it seems that some are unwilling to take steps to improve the security of their servers or their clients’ scripts. Believe it. Worse, some will even disable security measures, and for no good reason. I don’t mean being lax or giving up security policies for the sake of being client friendly and opening up every feature on the server so restrictions don’t “annoy” clients. I mean a case where a poster on a usenet newsgroup reported that when they tried to use taint mode in their CGI script coded in Perl, it would error. We all thought they must be mistaken: disabling it would have to be an intentional action and a specialized build, and for what reason?

Just to remove a specific security feature that helps clients code and use more secure scripts? It isn’t even forced on anyone; it’s simply an option in Perl that watches for really bad, really insecure functions and practices and warns or errors when it finds them. Yet this host insisted they didn’t offer it and wouldn’t, and that they had disabled it as a matter of policy. This made everyone reading the thread do a double take and just ask why. The questions were pondered with no rightful or sensible answers. It’s unimaginable that a web hosting provider, which is supposed to specialize in this field, would intentionally remove a security feature that doesn’t hinder the client’s experience or options whatsoever, and that is only used if they specifically add the -T switch to their shebang line (or call a module). This is crazy, but it’s true.
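For readers who haven’t used it, the habit that taint mode enforces looks roughly like the sketch below. Under -T, Perl treats data from outside the program as tainted and refuses to use it in risky operations (such as shell commands or file modifications) until it has been validated through a regular expression capture. The untaint_username helper and its allowed character set are assumptions for this example, and -T is deliberately left off so the sketch also runs as a plain "perl file.pl".

```perl
use strict;
use warnings;

# In a real CGI script, taint mode is switched on from the shebang line:
#     #!/usr/bin/perl -T
# It is left off here so this sketch can be run directly.

# Returns the validated value, or nothing (undef) if the input is rejected.
# Capturing with a strict pattern is how a value is untainted under -T.
sub untaint_username {
    my ($input) = @_;
    return unless defined $input;
    # Allow 1 to 32 letters, digits or underscores; reject anything else.
    if ($input =~ /\A([A-Za-z0-9_]{1,32})\z/) {
        return $1;    # a captured group is considered clean
    }
    return;
}

for my $raw ('justin', 'tim; rm -rf /', '../etc/passwd') {
    my $clean = untaint_username($raw);
    print defined $clean ? "accepted: $clean\n" : "rejected: $raw\n";
}
```

The pattern should describe what is allowed rather than what is forbidden; a capture such as (.*) would untaint anything and defeat the purpose.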

Some hosts are more than just lazy or unwilling; some apparently make actual efforts to make their systems and clients less secure. I’m still left pondering this. The only other explanation I can think of is that perhaps they are running mod_perl’s CGI emulation: since the interpreter is already running inside the Apache process by the time the script is loaded, the -T switch in the script’s own shebang line arrives “too late” to take effect, and Perl reports an error to that effect. Still, it sounds like that’s not the case. I don’t usually encourage people to seek a different hosting provider, but in this case, that provider can only be bad news. I don’t get why people wouldn’t care about security, but going as far as disabling good, default security options in the Perl interpreter is something I’ll never understand, and I find it even more troubling.

Watch those examples.

Saturday, October 4th, 2008 by Tim Greer

In efforts to help those online with questions and problems, the answers are sometimes more dire than the question or problem the original poster (or “OP”) had to start with, if the OP acts on them without recognizing a problem with the example. It’s always best to double check and re-read your suggestion or example, and not to post it at all if you lack the time to check what you are offering. Failing that, mention the problems it could cause if your untested example were used without verifying it’s actually what the OP needs, and make it clear that it is untested so the OP doesn’t run it blindly. Moreover, even your tested code or instructions might not work flawlessly for their specific issue, even if you think you fully understand their question or problem.

A good example that sticks out is a recent web hosting forum thread, where someone explained they wanted to find any opening PHP tags written as “<?” in their .php and .tpl files and replace them with “<?php”. There are valid reasons for this: an XML declaration also begins with <?, so it can be mistaken for a PHP opening tag when “short tags” are enabled. So while short tags in PHP are usually fine, always using <?php instead of <? will be correct across more systems and reduce the chances of problems and conflicts later (it’s more portable this way).

People offered various examples, where one person used a regular expression with a Perl command in shell, which is a common and powerful way to accomplish such things. It was to read in any *\.{php,tpl}$ files, find and replace the value and, before modifying the current file, create a backup of the pre-modified file as $filename.bak. This is always a good idea, as well as having a good, reliable backup of the data before modifying it (so you don’t have to rename files, perhaps selectively, on a live environment site). I was just browsing over the technical forum to see what I could help with, as I had some extra time that day, and saw the post with a suggested solution, and I saw the OP state they ran it and all was fine. I almost missed it, and perhaps I was too late in seeing this, so I posted a follow-up to that thread urging them not to run the example and explained the problem.

The person offering the suggestion was well-intentioned and obviously knew what they were doing, but simply didn’t catch the typo in their example. We all do this at some point, and they were appreciative that I caught it, as I would be if someone caught an example of mine that was confusing or could pose a problem for readers of a thread (more on that later). Instead of replacing any instance of “<? ” with “<?php ”, the example would actually replace every single white space character (“ ”) with “<?php ”. It used the regular expression s/<?\s/<?php /g, where s///g globally replaces whatever the left side matches (intended to be a less than sign, a question mark and a space) with the right side (less than, question mark, php and a space), or that was the intention anyway.

However, this had an unintended and unexpected result. The problem with the left side of the operation, <?\s (where \s is a white space character), is that ? is a special character in regular expressions, marking the preceding character, class or group as “optional”. The expression therefore asked to match \s (a white space) with an “optional” leading “<” character, so either “< ” or “ ” would be replaced with “<?php ”. Thus, the phrase “Hello, world, how are you?” in a file would become “Hello,<?php world,<?php how<?php are<?php you?”, without a single short PHP tag (<?) in sight to trigger the replacement. The file would be filled with broken, irrelevant and unwanted PHP open tags, one everywhere there was a space. The solution was to make the ? character be seen as a literal ?, meaning it needed to be escaped (backwacked) to disable the “optional” meaning. This is done by adding a backslash directly before the character (\?).
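The difference is easy to see by running both expressions against the same text. The sketch below wraps each one in a small function; the function names are made up for illustration, and the third variant (a lookahead that leaves the white space untouched) is an extra suggestion, not something from the forum thread.

```perl
use strict;
use warnings;

# As posted in the thread: "?" makes the "<" optional, so every white space
# character matches, with or without a "<" in front of it.
sub expand_tags_as_posted {
    my ($text) = @_;
    $text =~ s/<?\s/<?php /g;
    return $text;
}

# Corrected: "\?" is a literal question mark, so only "<?" followed by
# white space matches.
sub expand_tags_fixed {
    my ($text) = @_;
    $text =~ s/<\?\s/<?php /g;
    return $text;
}

# Variant: the lookahead (?=\s) checks for white space without consuming it,
# so a newline directly after "<?" is preserved.
sub expand_tags_keep_ws {
    my ($text) = @_;
    $text =~ s/<\?(?=\s)/<?php/g;
    return $text;
}

my $plain  = 'Hello, world, how are you?';
my $source = "<? echo 'hi'; ?>";

print expand_tags_as_posted($plain), "\n";   # spaces become "<?php "
print expand_tags_fixed($plain), "\n";       # unchanged
print expand_tags_fixed($source), "\n";      # <?php echo 'hi'; ?>
```

When run over real files, one common way to keep a .bak copy of each original is perl’s -i.bak switch together with -p and -e, though testing on copies first is still the safer habit.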

Therefore, a simple oversight of the difference between <?\s and <\?\s could cause dramatic and dire results. The poster who offered the solution, again, knew what he was doing and immediately recognized the mistake, but this goes to show that one single small mistake in an example, suggestion, command line, fix or instruction can cause unexpected and awful results. You should therefore always double check your example, even if it is just an example. This is well-worn advice in usenet newsgroups such as comp.lang.perl.misc, where the seasoned programming instructors who participate will remind you that, even in examples, you should always check for success or failure where applicable, such as checking that a file was successfully opened before trying to iterate through each line of the filehandle. The OP may not realize this is missing, try running your example, and have all sorts of unexpected results and problems. It’s best to create a simple and safe environment to test your example chunk of code in a controlled manner before offering a suggestion, or otherwise to clearly state that it is untested and potentially dangerous (if it could be) for them to run.
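As a small illustration of checking for success at each step, the sketch below counts the lines in a file and reports a failure to open it instead of silently doing nothing. It builds its own throwaway test file, in the spirit of testing in a safe environment; the count_lines name is invented for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Counts the lines in a file, checking every step that can fail.
sub count_lines {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open '$path' for reading: $!\n";
    my $count = 0;
    while (my $line = <$fh>) {
        $count++;
    }
    close($fh) or die "Cannot close '$path': $!\n";
    return $count;
}

# Build a small temporary file to test against, rather than real data.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} "one\ntwo\nthree\n" or die "Cannot write to '$tmp_path': $!\n";
close($tmp_fh) or die "Cannot close '$tmp_path': $!\n";

print count_lines($tmp_path), " lines\n";

# A missing file is reported instead of silently producing no output.
my $ok = eval { count_lines("$tmp_path.does-not-exist"); 1 };
print "error caught: $@" unless $ok;
```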

This goes back to a similar experience I had in the Perl newsgroup recently, where the OP asked what the meaning of m@ was in their example of $string =~ m@@si; I had explained that this was a regex match (m//) and that @ was just a delimiter, and that “m(anything) is making (anything) the delimiter. m// is just the most common to see, but m,, or m!! and so on, are perfectly acceptable. It can help save typing over using m// when you have a lot of / characters you’d otherwise have to backwack (escape). (i.e., m!<b>this</b><i>that</i>!si saves typing and is more readable than m@<title>\s*(.*?)\s*<\/title><b>this<\/b><i>that<\/i>@si, especially if you have a lot of characters within the regex that are also your delimiters)”.

Well, someone (Ilya) called me on it, and rightfully so, stating that what I said would be correct, except that m(anything) uses “(” and “)” as delimiters. This is true. I used a too literal example with “(anything)”, which wasn’t intended to say that an entire word, or a word within parentheses, would be a valid opening and closing delimiter in a regular expression. It was a good point. Even though it’s probably obvious to most, the OP may not have known (they probably didn’t, since they had to ask originally anyway), and my response could have caused them a lot of confusion.

In fact, the OP actually replied to my example and said “And what would be the reason for using ( and ) as delimiters in this case?”, and I found myself explaining my example, as it clearly had caused them some confusion. On top of that, certain characters can’t simply be used as a delimiter, or can only be used depending on a few other variables (not literal $variables, of course), so you can’t really use just anything, and especially not “(anything)”.
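To make the delimiter discussion concrete, here is a minimal sketch that writes one pattern five ways and confirms they all behave the same. The sample HTML string and the all_match name are made up for illustration.

```perl
use strict;
use warnings;

# Runs the same pattern with five different delimiters and returns a list
# of 1 (matched) or 0 (did not match), one entry per delimiter style.
sub all_match {
    my ($html) = @_;
    my $slash = ($html =~ m/<b>this<\/b><i>that<\/i>/si) ? 1 : 0;  # "/" must be escaped
    my $bang  = ($html =~ m!<b>this</b><i>that</i>!si)   ? 1 : 0;  # no escaping needed
    my $comma = ($html =~ m,<b>this</b><i>that</i>,si)   ? 1 : 0;  # a comma works too
    my $brace = ($html =~ m{<b>this</b><i>that</i>}si)   ? 1 : 0;  # paired: { opens, } closes
    my $paren = ($html =~ m(<b>this</b><i>that</i>)si)   ? 1 : 0;  # paired: ( opens, ) closes
    return ($slash, $bang, $comma, $brace, $paren);
}

# Some characters carry extra meaning as delimiters: m'...' turns off
# variable interpolation and m?...? matches only once between reset() calls,
# which is part of why "anything" is not quite true.

print join(' ', all_match('<b>this</b><i>that</i>')), "\n";
print join(' ', all_match('<b>other</b>')), "\n";
```

This is also why m(anything) reads as a pattern “anything” delimited by parentheses, which was Ilya’s point.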

I suppose this is obvious, but most of us take it for granted. The OP might not understand, might take you too literally, or might run a piece of code or a command, or follow an instruction, that you didn’t mean as a complete and perfect solution for them. That could spell disaster or result in a lot of confusion and follow-up questions (and maybe more of your time, at a less opportune moment, since you are now responsible for explaining the confusion or problem you caused). Therefore, take a few more seconds or minutes to double check for typos, and remember that the OP might copy, paste and run whatever you tell them. Make an effort to clarify, re-read what you wrote so it’s not taken wrong or too literally, and test it, if applicable, before you hit that submit/send button.

Perl Programming and regular expression fun.

Saturday, October 4th, 2008 by Tim Greer

Recently, in the Perl programming language usenet newsgroup (comp.lang.perl.misc), a poster posed a question, asking if a certain task was possible, thinking it wasn’t. I read it and immediately thought “Sure, that’s possible, it’s easy…” and went on to post a solution a short time later. However, I had to think about it for a minute, because I had never considered doing something like this. To be more accurate, I have done things just like this, but wouldn’t have approached it this way. Of course, there’s more than one way to skin a cat and some ways are far more creative, fun, challenging or inventive than others.

The question was an interesting challenge, because the poster wanted to use only a regular expression to dynamically detect duplicate instances of any single character in a string, keeping the first instance and removing the rest that follow. Most people assumed that the duplicate characters in the string would be predetermined or already known, but they weren’t (e.g., it could be any single character repeated in the string, or multiple characters repeated throughout, not just one character you already knew about that could be used more than once).

They had posed the following string examples:

a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg

My posted solution was a simple and effective one line of relevant code:

$string =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;

This captures each single character in turn and checks whether that same character has already appeared earlier in the string. If it has, the character is replaced with nothing; if not, it is put back unchanged. =~ s|…|…|g; repeats the substitution globally for each single character (.) it finds and captures into $1, while /e evaluates the right side as Perl code rather than as a plain string. That code uses $` (everything in the string before the current match) and tests it with ($` =~ m/$1/), which is true if $1 has been seen before. The ternary operator then returns '' (nothing) if the test is true (?), or else (:) $1 if it is false (no earlier match yet for that single character). Note that $1 is interpolated into the inner pattern as a regular expression, so a character such as . or ( would need quoting (for example with \Q$1\E) to be matched literally.

I created the following script to demonstrate this (pardon the fact I didn’t put it in a loop to show the varied strings and parsed output):

#!/usr/bin/perl
use warnings;
use strict;

my $linea = "a''bc'def'g'";
my $lineb = "'''ab'cd'efg";
my $linec = "abc'd'e''f'g";

print "$linea -> ";
$linea =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$linea\n";

print "$lineb -> ";
$lineb =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$lineb\n";

print "$linec -> ";
$linec =~ s|(.)| ($` =~ m/$1/) ? '' : $1 |eg;
print "$linec\n";

The output:
~]$ ./script.pl
a''bc'def'g' -> a'bcdefg
'''ab'cd'efg -> 'abcdefg
abc'd'e''f'g -> abc'defg
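For anyone who wants the loop version, a minimal sketch follows. It swaps the inner pattern match for index(), which looks for the captured character literally, so strings containing regex metacharacters such as . or ( are handled too. That swap, the keep_first_occurrence name and the fourth test string are additions for this example, not part of the newsgroup post.

```perl
use strict;
use warnings;

# Keeps the first occurrence of every character and drops later repeats.
sub keep_first_occurrence {
    my ($string) = @_;
    # $` holds the text before the current match. index() searches it for
    # the captured character as a plain string, so nothing needs escaping.
    $string =~ s/(.)/ index($`, $1) >= 0 ? '' : $1 /eg;
    return $string;
}

for my $line ("a''bc'def'g'", "'''ab'cd'efg", "abc'd'e''f'g", 'a.b.c(d(e') {
    print "$line -> ", keep_first_occurrence($line), "\n";
}
```

Like the original, (.) does not match a newline without the /s modifier, so line breaks in a multi-line string are left alone.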

Following this example, another poster (Ben Morrow) followed up with some other methods of accomplishing the same task, also using only regular expressions:

Without using /e:
~% perl -le'$_ = "abccbdcdc"; 1 while s/(.)(.*)\1/$1$2/g; print'
abcd

Using 5.10’s \K we can remove the replacement part:
~% perl5.10.0 -le'$_="abccbdcdc"; 1 while s/(.).*\K\1//g; print'
abcd

and if we reverse the string before and after (so we can use look*ahead* instead, which can be variable-length) we can remove the while loop:

~% perl -le'$_ = reverse "abccbdcdc"; s/(.)(?=.*\1)//g;
print scalar reverse'
abcd
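To compare these approaches side by side, the one-liners can be wrapped in small functions, as in the sketch below. The function names are invented for illustration, and the \K version needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Without /e: loop until no character is followed by a later copy of itself.
sub dedupe_loop {
    my ($s) = @_;
    1 while $s =~ s/(.)(.*)\1/$1$2/g;
    return $s;
}

# With \K (Perl 5.10+): everything matched before \K is kept, so only the
# repeated character after it is deleted.
sub dedupe_keep {
    my ($s) = @_;
    1 while $s =~ s/(.).*\K\1//g;
    return $s;
}

# Reverse the string, drop every character that appears again later
# (lookahead), then reverse back. No loop is needed.
sub dedupe_reverse {
    my ($s) = @_;
    my $r = reverse $s;
    $r =~ s/(.)(?=.*\1)//g;
    return scalar reverse $r;
}

for my $input ('abccbdcdc', "a''bc'def'g'") {
    print join(' | ', $input, dedupe_loop($input),
               dedupe_keep($input), dedupe_reverse($input)), "\n";
}
```

All three use backreferences (\1) rather than interpolating the captured character into a new pattern, so metacharacters in the input are matched literally.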

Now, how cool is that? I offered one method, and Ben offered three additional methods, for a total of four different ways to accomplish the task using only regular expressions. Perl’s regular expressions, and the varied ways of coming up with solutions, truly make it a great language. This is only one of many cool tricks and challenges people have posed that impress me about Perl’s power, but it was a recent one that stuck in my mind. Since I didn’t see any existing examples out there, this could be useful information to learn from and use if you should run into such a challenge some day.

There are a million reasons to use regular expressions, and if used properly, they are a very powerful, time saving and accurate feature of any language. The above is just one of many examples that illustrate the Tim Toady (TIMTOWTDI -> “There Is More Than One Way To Do It“) power of Perl. Doing that much logic in a very short, simple regular expression. One of about 10,000 things that show how cool Perl is. By far, my language of choice when coding.

