regex query to change US dates into European formatting?
Thread poster: Jan Sundström
Jan Sundström | Sweden | Local time: 00:49 | Member (1970) | Swedish to English + ...
Hi all,
I know you guys (Vito, Jerzy, Alexej, Samuel) are brilliant with advanced regex searches, so I'm shamelessly relying on your help.
I have a bunch of html files containing dates in US formatting:
...text... 9/11/2001 ...text....
...text... 12/31/2006 ...text...
I'm trying to apply Vito's method, posted here:
http://www.proz.com/post/540222#540222
In plain text, it would be something like:
Search for nn[1]/nn[2]/nnnn[3]
Replace with [3]/[1]/[2], where the numbers are placeholders, calling the parameters in the order they were found in the search.
I'm just trying to figure out how to write the query, this is where I need your help.
Thanks a lot in advance for your assistance. I promise to reward it somehow!
/Jan
Here it goes... | May 24, 2007
I have not been called upon, yet I will try nevertheless.
I assume you want it done in Word? (Other editors have a somewhat different flavor of regexp.)
It should go like this:
Find:
([0-9]@)/([0-9]@)/([0-9]@)>
Replace:
\3/\1/\2
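For anyone who wants to try the idea outside Word, here is a minimal sketch of the same three-group swap in Python. Python is not mentioned in the thread, and the function name `us_to_ymd` is made up for illustration. The pattern uses `+` (one or more digits) where Word's wildcard syntax uses `@`.

```python
import re

# Three capture groups in US order: month, day, year.
US_DATE = re.compile(r"([0-9]+)/([0-9]+)/([0-9]+)")


def us_to_ymd(text: str) -> str:
    """Reorder month/day/year into year/month/day (groups 3, 1, 2)."""
    return US_DATE.sub(r"\3/\1/\2", text)


print(us_to_ymd("...text... 9/11/2001 ...text..."))
print(us_to_ymd("...text... 12/31/2006 ...text..."))
```

The replacement string refers to the groups in the order they should appear in the output, which is what the numbered placeholders in the original question describe.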
I have checked it with several expressions, but let me know how it works...
Robert Tucker (X) | United Kingdom | Local time: 23:49 | English to German + ...
Using Perl, I think it is:
perl -pi -e 's/([0-9]*)\/([0-9]*)\/([0-9]*)/\3\/\1\/\2/g' filename.txt
Jan Sundström | Sweden | Local time: 00:49 | Member (1970) | Swedish to English + ... | TOPIC STARTER
Text editor suggestions?! | May 24, 2007
Hey Jabberwock, Robert,
You guys are great!
Perl is very powerful, but I'm afraid that I've hardly ever used it, so it's a non-starter until I learn more.
This time, I'd hope to do it in a text editor. Any suggestions?
I've been using TextPad out of old habit, but I'm getting fed up with the lack of Unicode support (copyright marks and other extended chars show up as black boxes).
Would any text editor use the same expressions as Word? Any suggestions?
/Jan
Haven't found one... | May 24, 2007
I haven't found a text editor which would suit exactly my tastes and my needs... I am trying out one after another, but still missing something...
For now I have settled for PSPad, but I am not sure I would recommend it to someone. The interface is somewhat muddled and the search feature is somewhat inconvenient... The codepage changes, on the other hand, are a breeze!
For PSPad the relevant expressions are:
([0-9]+)/([0-9]+)/([0-9]+)
and:
$3/$1/$2
For TextPad:
\([0-9]+\)/\([0-9]+\)/\([0-9]+\)
and:
\3/\1/\2
With other editors there might be a problem with greediness (so that the search catches only the first digit of the date). Also, check whether the replace function moves on to the next match, because the newly replaced expression will naturally be caught by the search as well. You can get around it by specifying the number of digits, e.g.
([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})
but I do not think this is really necessary...
Vito Smolej | Germany | Local time: 00:49 | Member (2004) | Slovenian to English + ... | SITE LOCALIZER
few hints... | May 24, 2007
Hi Jan:
Note that slash has its own meta meaning in Word regex, so what I do in such cases borders on the obscene: I replace it in the relevant contexts with ... er, hm ... §? (for instance; some character you will not miss or, worse, hit unintentionally). Then it's easier to write the search and replace entries....
The Tortoise Tagger is a good help for this kind of stuff. The subject by itself is hard, but the documentation is good. In essence you write a batch file to do all this: "first global search and replace / by $, then hide anything but ..." etc.
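As a rough illustration of the placeholder trick only, here is a sketch in Python. This is not how Word or the Tortoise Tagger does it, the function name is hypothetical, and Python does not actually need the detour because a slash has no special meaning in its regex syntax. The three steps are: mark the slashes that sit between digits, reorder around the marker, then restore the slashes.

```python
import re

PLACEHOLDER = "§"  # assumed not to occur anywhere in the text


def swap_with_placeholder(text: str) -> str:
    """Reorder m/d/yyyy to yyyy/m/d via a temporary placeholder character."""
    if PLACEHOLDER in text:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: replace only slashes that have a digit on both sides.
    marked = re.sub(r"(?<=[0-9])/(?=[0-9])", PLACEHOLDER, text)
    # Step 2: reorder the three number groups around the placeholder.
    swapped = re.sub(
        r"([0-9]+)§([0-9]+)§([0-9]+)", r"\3§\1§\2", marked
    )
    # Step 3: turn any remaining placeholders back into slashes.
    return swapped.replace(PLACEHOLDER, "/")


print(swap_with_placeholder("<p>9/11/2001</p>"))
```

Slashes in tags such as `</p>` are left alone because they are not surrounded by digits.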
PS: awk IS supposed to do this kind of transliteration - but try to do that on a Word file...
[Edited at 2007-05-24 20:32]
Robert Tucker (X) | United Kingdom | Local time: 23:49 | English to German + ...
Text editors v. command line | May 24, 2007
The thing about using a command-line tool like Perl is that it can do the Search and Replace on all the files in a folder in one go.
If you were to change directory to the folder containing all the html files, you could just issue:
perl -pi -e 's/([0-9]*)\/([0-9]*)\/([0-9]*)/\3\/\1\/\2/g' *.html
they would all be done.
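For readers without Perl, a rough Python counterpart of that one-liner might look like the sketch below. It rests on several assumptions: the files are UTF-8, the helper names `convert_text` and `convert_folder` are invented here, and the pattern is deliberately stricter than the one-liner's. It pins the digit counts, as suggested earlier in the thread, and refuses matches that touch another digit or slash, so running it a second time should not re-match dates that were already converted. Work on a backup copy first, since files are rewritten in place.

```python
import re
from pathlib import Path

# 1-2 digits / 1-2 digits / 4 digits, not adjacent to another digit or slash.
US_DATE = re.compile(
    r"(?<![0-9/])([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})(?![0-9/])"
)


def convert_text(text: str) -> str:
    """Reorder m/d/yyyy into yyyy/m/d."""
    return US_DATE.sub(r"\3/\1/\2", text)


def convert_folder(folder: str, glob: str = "*.html",
                   encoding: str = "utf-8") -> int:
    """Rewrite matching files in place; return how many files changed."""
    changed = 0
    for path in sorted(Path(folder).glob(glob)):
        # newline="" keeps the original line endings untouched.
        with open(path, encoding=encoding, newline="") as f:
            original = f.read()
        converted = convert_text(original)
        if converted != original:
            with open(path, "w", encoding=encoding, newline="") as f:
                f.write(converted)
            changed += 1
    return changed
```

Calling `convert_folder(".")` from the folder that holds the html files would process all of them in one pass, much like the `*.html` argument in the Perl command.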
If you use a text editor, I suspect you are going to need to open all the html files individually, edit them and then close them again unless you can write some batch script. Also text editors tend to be slower than command line ones doing a single file – though you might need a file of a few hundred lines to notice it much.
That said (and having just looked at how "grep", much used on Unix/Linux systems, can be used on Windows), I found Windows Grep, which you might like to look at.
Essentially though, you might find it quicker in the long run to look into command line than to try to do it all with a text editor.
Regarding awk, I'm not sure it does backreferences:
Grouping support is present in Perl together with various backreference mechanisms. Grouping is also supported by all awk variants, but backreferences are not. GNU grep, egrep and sed support both grouping and backreferences.
http://snow.nl/dist/htmlc/ch13.html
justin C | United States | Local time: 18:49 | English
Jan Sundström | Sweden | Local time: 00:49 | Member (1970) | Swedish to English + ... | TOPIC STARTER
Thanks a million! | May 25, 2007
Jabberwock wrote:
For TextPad:
\([0-9]+\)/\([0-9]+\)/\([0-9]+\)
and:
\3/\1/\2
This time I chose to go with Jabberwock's solution, because I'm familiar with the interface and search dialogue of TextPad. It works like a charm!
TextPad sort of handles batch processing (taking care of all open files), but AFAIK you can't let it work on a path or directory structure.
At my previous workplace, we had a Perl guru who could do anything and everything thru the command line, and vim was of course also used.
So for future jobs, I'll definitely look into the other options mentioned.
I owe all of you a big one!!!
/Jan
One more tool... | May 25, 2007
Glad it works!
The mention of Windows Grep (which costs money) made me remember another tool, which is freeware:
http://www.orbit.org/replace/
I think it is a nice intermediate step between editor search and advanced S&R languages, as it is both quite capable and yet easy to use.
Of course, if you can't live a day without a search and replace, then you might need a much bigger tool: TextPipe. Unfortunately, this costs a lot of money, so first make sure you really need it...