RegExp Extractor

RegExp Extractor is a utility that extracts data from text files and logs using conditions and rules written as regular expressions.
It is very fast and can process huge files.

Download RegExp Extractor

RegExp Extractor 1.9 (build 72, 2012-05-25)
OS: Windows NT, 2000, XP, Vista, 7

Buy RegExp Extractor now

Price: $25

We accept credit cards, PayPal, wire transfers and other payment methods. We use SWREG to handle online transactions.

How to Use

To use this program you need to know regular expressions (regexp).

Source file(s) - the file or files that you want to extract data from. You can use a mask here, e.g. c:\temp\*.txt

Output file - the file that receives the extracted data.

Output dir - RegExp Extractor can produce several output files. This option sets the destination folder for them.

Save other lines to file - the file that receives the lines that don't match any regular expression.

Conditions/Rules Tabs

Each tab contains a set of conditions and rules for extracting data.
For example, the "emails" tab contains conditions and rules to extract emails, and the "url-domains" tab contains conditions and rules to extract domains from URLs.

To add a new tab, use the [+] button below the tabs; to remove an existing tab, use the [-] button.

When you press the "Start" button, RegExp Extractor extracts data from the source file(s) using the conditions and rules from the active tab.

Each set of conditions and rules has a title (the name of the tab).

Extract Conditions
Each line contains a named regular expression in the form name=/regexp/. Conditions are referenced by name in Extract Rules.

Extract Rules
Each line contains a rule that specifies which data to extract.

Example 1.

Condition: email=/[a-z0-9][a-z0-9.-]+[a-z0-9]@[a-z0-9][a-z0-9.-]+[a-z0-9]/
Rule: email:$0

In this example, email is the name of the condition.
/[a-z0-9][a-z0-9.-]+[a-z0-9]@[a-z0-9][a-z0-9.-]+[a-z0-9]/ is the regular expression that matches emails.
$0 specifies that every sub-string of the source line that matches the condition is extracted. In this example, each extracted sub-string is an email address.
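To make the $0 behaviour concrete, here is a minimal Python sketch that applies the same regular expression to one line and collects every matching sub-string. It only illustrates the idea; it is not RegExp Extractor's own code, and the function name extract_all is made up for this example.

```python
import re

# Regular expression copied from the "email" condition in Example 1.
# Like the original condition, it has no /i flag, so it matches lowercase only.
EMAIL = re.compile(r"[a-z0-9][a-z0-9.-]+[a-z0-9]@[a-z0-9][a-z0-9.-]+[a-z0-9]")


def extract_all(line):
    """Return every sub-string of the line that matches (the $0 case)."""
    return [m.group(0) for m in EMAIL.finditer(line)]


print(extract_all("from email1@domain1.com to email2@domain2.com"))
# prints ['email1@domain1.com', 'email2@domain2.com']
```

Each match becomes one result line, which is why a single source line can produce several extracted emails.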

Example 2.

Condition: url-domain=/https?://([a-z0-9][a-z0-9.-]+[a-z0-9])|(www\.[a-z0-9.-]+[a-z0-9])/i
Rule: url-domain:$1$2

url-domain is the name of the condition.
/https?://([a-z0-9][a-z0-9.-]+[a-z0-9])|(www\.[a-z0-9.-]+[a-z0-9])/i is the regular expression that matches URLs.
$1$2 specifies that the first group ([a-z0-9][a-z0-9.-]+[a-z0-9]) and the second group (www\.[a-z0-9.-]+[a-z0-9]) are extracted from each sub-string that matches the regular expression.
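The way $1$2 combines two groups can be sketched in Python. The sketch assumes that a group which did not take part in a match expands to an empty string, so only one of the two groups contributes text for any given match; that assumption and the function name extract_domains are ours, not taken from the program.

```python
import re

# Regular expression copied from the "url-domain" condition; /i maps to IGNORECASE.
URL_DOMAIN = re.compile(
    r"https?://([a-z0-9][a-z0-9.-]+[a-z0-9])|(www\.[a-z0-9.-]+[a-z0-9])",
    re.IGNORECASE,
)


def extract_domains(line):
    """Build the $1$2 result for every match in the line."""
    results = []
    for m in URL_DOMAIN.finditer(line):
        # Assumption: an unmatched group contributes an empty string.
        results.append((m.group(1) or "") + (m.group(2) or ""))
    return results


print(extract_domains("see http://example.com/page and www.test.org now"))
# prints ['example.com', 'www.test.org']
```

Because the two groups sit in different alternatives of the expression, writing $1$2 is a compact way to say "whichever alternative matched".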

You can also use other characters in a rule to build the result lines, e.g.: email:The email is $0
The result lines will look like this:

The email is email1@domain1.com
The email is email2@domain2.com
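The substitution of $0 into surrounding text can be sketched in Python as a small template expansion. This is an illustration under assumptions: the helper names expand and apply_rule are hypothetical, and treating a reference to a missing group as empty text is our guess rather than documented behaviour.

```python
import re

# Regular expression copied from the "email" condition in Example 1.
EMAIL = re.compile(r"[a-z0-9][a-z0-9.-]+[a-z0-9]@[a-z0-9][a-z0-9.-]+[a-z0-9]")


def expand(template, match):
    """Replace $0..$9 in the template with the corresponding match groups."""
    def group_text(ref):
        index = int(ref.group(1))
        if index > match.re.groups:
            return ""  # assumption: a missing group expands to nothing
        return match.group(index) or ""
    return re.sub(r"\$(\d)", group_text, template)


def apply_rule(template, line):
    """Produce one result line per match found in the source line."""
    return [expand(template, m) for m in EMAIL.finditer(line)]


for result in apply_rule("The email is $0", "email1@domain1.com, email2@domain2.com"):
    print(result)
```

Everything in the template that is not a $ reference is copied to the result line unchanged.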

Separate by conditions

This option saves lines that match different conditions into different files in the output folder.
See Example 3 below.

Example 3.

Separate by conditions = On

Conditions

sent-ok=/sent ok/i
blocked=/blocked/i
http=/(https?://)|(www\.)[a-z0-9.-]+//
err=/(-ERR \[[0-9]{3}\] : ).+ : (.+)/

Rules

sent-ok!:$L
blocked!^err:$L
http!^err:$L
err!:$L

This example saves all lines that contain the sent ok sub-string to sent-ok.txt,
lines that contain the blocked sub-string AND don't match the err condition to blocked.txt,
lines that contain URLs (that match the http condition) AND don't match the err condition to http.txt,
and lines that match the err condition to err.txt.

The ! sign after the condition name in a rule means that RegExp Extractor stops processing the remaining rules for a line once the line matches the condition from this rule. If we omit ! in our example, RegExp Extractor will save the line sent ok: blocked to both files: sent-ok.txt and blocked.txt.
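The effect of ! can be sketched in Python as a loop over rules that optionally stops at the first match. This is a simplified model, not the program's implementation: it covers only the sent-ok and blocked conditions, names each output file after its condition, and the route function and its (name, stop) rule format are invented for the illustration.

```python
import re

# Two conditions copied from Example 3; /i maps to IGNORECASE.
CONDITIONS = {
    "sent-ok": re.compile(r"sent ok", re.IGNORECASE),
    "blocked": re.compile(r"blocked", re.IGNORECASE),
}


def route(line, rules):
    """Return the output files a line would be saved to.

    rules is a list of (condition_name, stop) pairs, where stop plays
    the role of the ! sign after the condition name.
    """
    files = []
    for name, stop in rules:
        if CONDITIONS[name].search(line):
            files.append(name + ".txt")
            if stop:
                break  # the ! sign: skip the remaining rules for this line
    return files


line = "sent ok: blocked"
print(route(line, [("sent-ok", True), ("blocked", True)]))
# prints ['sent-ok.txt']
print(route(line, [("sent-ok", False), ("blocked", False)]))
# prints ['sent-ok.txt', 'blocked.txt']
```

With ! in place, rule order matters: the first matching rule decides where the line goes.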

^ in blocked!^err means that the line should match the condition blocked and match the condition err.

You can also use the ~ sign, which means that the line SHOULD NOT match the condition that follows it.
Example: blocked!~sent-ok:$L
$L means that the WHOLE line is extracted, not only the sub-string that matches the regular expression.
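As a rough Python sketch of the rule blocked!~sent-ok:$L, the function below keeps whole lines that match blocked and do not match sent-ok. The function name is hypothetical, and the sketch models only the ~ sign and $L as described above, nothing else about rule processing.

```python
import re

# Conditions copied from Example 3; /i maps to IGNORECASE.
BLOCKED = re.compile(r"blocked", re.IGNORECASE)
SENT_OK = re.compile(r"sent ok", re.IGNORECASE)


def blocked_not_sent_ok(lines):
    """Keep whole lines ($L) that match blocked and do not match sent-ok (~)."""
    return [
        line
        for line in lines
        if BLOCKED.search(line) and not SENT_OK.search(line)
    ]


sample = ["sent ok: blocked", "message blocked by filter", "sent ok"]
print(blocked_not_sent_ok(sample))
# prints ['message blocked by filter']
```

Note that the result is the complete source line, while $0 would yield only the matched word blocked.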

Example 4.

Separate by conditions = Off

Conditions

http=/https?://([a-z0-9][a-z0-9.-]+[a-z0-9])|(www\.[a-z0-9.-]+[a-z0-9])/i

Rules

http>>$1$2.txt:$L

In this rule, the output file name is given as $1$2.txt, where $1$2 is the domain of the extracted URL.
This example shows how to separate lines by domain.
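The idea behind http>>$1$2.txt:$L can be sketched in Python by grouping whole lines under a file name built from the matched domain. The sketch makes several assumptions of its own: it keeps the groups in memory instead of writing files, uses only the first URL found in each line, and treats an unmatched group as empty text. The function name group_by_domain is hypothetical.

```python
import re
from collections import defaultdict

# Regular expression copied from the "http" condition in Example 4.
HTTP = re.compile(
    r"https?://([a-z0-9][a-z0-9.-]+[a-z0-9])|(www\.[a-z0-9.-]+[a-z0-9])",
    re.IGNORECASE,
)


def group_by_domain(lines):
    """Map each output file name ($1$2.txt) to the whole lines ($L) it receives."""
    files = defaultdict(list)
    for line in lines:
        m = HTTP.search(line)  # assumption: only the first URL in a line counts
        if m:
            name = (m.group(1) or "") + (m.group(2) or "") + ".txt"
            files[name].append(line)
    return dict(files)


log = [
    "GET http://example.com/a",
    "GET http://example.com/b",
    "visit www.test.org",
    "no url here",
]
for name, grouped in group_by_domain(log).items():
    print(name, grouped)
```

Lines without a URL are left out of every group, which corresponds to lines that do not match the http condition.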