Regular Expression Cheat Sheet

StarAgile. Last updated on March 10, 2023. 15 mins read.

A regular expression (regex or regexp) is a specialised pattern that describes a set of text strings. A quick reference that collects such patterns, from whatever source, is called a regex cheat sheet or regular expressions cheat sheet. Regular expressions are used in natural language processing and in other common text data manipulation tasks.

More on regular expressions cheat sheet

Several regex engines can process regular expressions. Engines behave differently depending on the syntax they support, and you can find a list of the most popular engines in this blog. Python and R are two common languages that each have their own engine.

Because a regex describes a pattern of text, a regex cheat sheet can help you extract substrings from longer strings, check whether a text contains a pattern, and modify text. Regexes range from simple ones that describe specific words to more complex ones that find general patterns of characters, such as the top-level domain at the end of a URL.
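
The three uses mentioned above (extracting, checking, and modifying) can be sketched with Python's built-in re module. This is only an illustration: the sample text, the URLs, and the variable names are made up for the example and are not part of the original cheat sheet.

```python
import re

# Hypothetical sample text used only for this illustration.
text = "Docs live at https://example.org and https://example.com"

# Extract substrings: capture the top-level domain at the end of each URL.
tlds = re.findall(r"https?://[\w.-]+\.([a-z]+)", text)

# Check for a pattern: does the text contain an https URL at all?
has_https = re.search(r"https://", text) is not None

# Modify text: replace every URL with a placeholder.
redacted = re.sub(r"https?://\S+", "<url>", text)

print(tlds)       # ['org', 'com']
print(has_https)  # True
print(redacted)   # Docs live at <url> and <url>
```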

Definitions

  • Literal Character - You can use literal characters in regular expressions to represent actual characters. For example, if you want to represent an "r", then you would write "r".
  • Metacharacter - A metacharacter is a character that has a special meaning to the regex engine instead of matching itself. The backslash (\), for example, tells the engine that the following character has a special meaning. Metacharacters can serve as markers for the beginning or end of the text, or match single characters.
  • Character Class - A character class (or character set) tells the engine exactly which characters it should look for. It is denoted by [ and ], with the characters you wish to find between the brackets.
  • Capture Group - A capture group lets you group parts of a regex together using parentheses and apply other regex features, such as quantifiers, to the whole group.
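
As a rough sketch of the four definitions above, the following Python snippet shows one small case of each. The sample strings and variable names are invented for illustration.

```python
import re

# Literal character: "r" matches each actual letter r.
literal_hits = re.findall(r"r", "rare birds")

# Metacharacter: an unescaped "." matches any character,
# while the escaped "\." matches only a real dot.
any_char_hits = re.findall(r"3.5", "3.5 3x5")
dot_only_hits = re.findall(r"3\.5", "3.5 3x5")

# Character class: [aeiou] matches any single vowel listed in the brackets.
vowel_hits = re.findall(r"[aeiou]", "birds cost")

# Capture group: (ab)+ applies the + quantifier to the whole group "ab".
group_match = re.search(r"(ab)+", "xxababab")

print(literal_hits)          # ['r', 'r', 'r']
print(any_char_hits)         # ['3.5', '3x5']
print(dot_only_hits)         # ['3.5']
print(vowel_hits)            # ['i', 'o']
print(group_match.group(0))  # ababab
print(group_match.group(1))  # ab
```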

Regular expressions are a powerful tool for manipulating text and searching for patterns. Here is a cheat sheet of some of the most commonly used regex syntax, including syntax used in Python:

1. Anchors

Anchors match a position just before or after a character, not the character itself:

Syntax   Description                            Example pattern  Example matches            Example non-matches
^, \A    Matches the start of a line            ^r, \Ar          run; recess                dog; cat
$, \Z    Matches the end of a line              n$, n\Z          corn; fun                  mobile; speaker
\b       Matches at the start or end of a word  \brat\b          the rat ran; the rat ate   ratskin; ratflow
\B       Matches at a position inside a word    \Boo\B           book; shook                shoe; flue
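
The anchor examples can be checked with a short Python sketch. The word lists reuse the cheat sheet's examples, plus "oolong" as an extra made-up case; the variable names are illustrative.

```python
import re

# ^ and $ anchor the pattern to the start and end of the string.
starts_with_r = [w for w in ["run", "recess", "dog", "cat"] if re.search(r"^r", w)]
ends_with_n = [w for w in ["corn", "fun", "mobile", "speaker"] if re.search(r"n$", w)]

# \b marks a word boundary, so "rat" must stand alone as a word.
whole_word = [s for s in ["the rat ran", "ratskin"] if re.search(r"\brat\b", s)]

# \B requires the opposite: "oo" must sit inside a word, not at its edge.
inside_word = [w for w in ["book", "shook", "shoe", "oolong"] if re.search(r"\Boo\B", w)]

print(starts_with_r)  # ['run', 'recess']
print(ends_with_n)    # ['corn', 'fun']
print(whole_word)     # ['the rat ran']
print(inside_word)    # ['book', 'shook']
```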

 

 

 

2. Matching types of character

Instead of matching one specific character, you can match a type of character, such as a letter, a digit, or whitespace.

Syntax          Description                          Example pattern  Example matches               Example non-matches
.               Anything except a line break         c.e              cheap; clean                  acert; cent
\d              Matches a digit                      \d               6060-896; 2b|^2b              Tw; **___
\D              Matches a non-digit                  \D               The 5 cats ate 12 angry rats  52; 10032
\w              Matches a word character             \wee\w           trees; bees                   The bee; Eels eat skin
\W              Matches a non-word character         \Wbat\W          At bat!; Swing the bat slow   wombat; bat53
\s              Matches whitespace                   \sfox\s          The fox ate; The fox ran      It's the fox; fox-fur
\S              Matches non-whitespace               \See\S           trees; beef                   The bee stung; The tall tree
\metacharacter  Matches the metacharacter literally  \., \^           The cat ate.; 2^3             The cat ate; 23
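
A minimal Python sketch of the character types listed in this section. The sentence is borrowed from the \D example, and the other strings and names are invented for illustration.

```python
import re

sentence = "The 5 cats ate 12 angry rats."

digits = re.findall(r"\d+", sentence)          # runs of digits
words = re.findall(r"\w+", sentence)           # runs of word characters
gaps = re.findall(r"\s", sentence)             # individual whitespace characters
dot_wild = re.findall(r"c.t", "cat cot c\nt")  # "." skips newlines by default
literal_dot = re.findall(r"\.", sentence)      # escaped metacharacter

print(digits)       # ['5', '12']
print(words)        # ['The', '5', 'cats', 'ate', '12', 'angry', 'rats']
print(len(gaps))    # 6
print(dot_wild)     # ['cat', 'cot']
print(literal_dot)  # ['.']
```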

 

3. Character classes

These are sets or ranges of characters:

Syntax   Description                                      Example pattern  Example matches   Example non-matches
[xy]     Matches any one of the listed characters         gr[ea]y          gray; grey        green; greek
[x-y]    Matches any one character in the range           [a-e]            amber; brand      fox; join
[^xy]    Matches any one character that is not listed     gr[^ea]y         groyne; gruyere   gray; grey
[\^\-]   Matches metacharacters inside a character class  4[\^\.\-+*/]\d   4^3; 4.2          44; 23
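
The behaviour of character classes can be sketched in Python as follows. The word list and variable names are illustrative. The operator class places the hyphen last so that it is read as a literal hyphen instead of a range.

```python
import re

# Hypothetical word list for this illustration.
candidates = ["gray", "grey", "green", "groyne"]

# [ea] accepts either listed character between "gr" and "y".
either = [w for w in candidates if re.search(r"gr[ea]y", w)]

# [^ea] accepts any character except the listed ones.
neither = [w for w in candidates if re.search(r"gr[^ea]y", w)]

# [a-e] accepts any one character in the range a to e.
in_range = re.findall(r"[a-e]", "brand")

# Metacharacters inside a class: the caret is escaped, and the hyphen comes last.
operators = re.findall(r"4[\^.+*/-]\d", "4^3 4.2 4-1 44 23")

print(either)     # ['gray', 'grey']
print(neither)    # ['groyne']
print(in_range)   # ['b', 'a', 'd']
print(operators)  # ['4^3', '4.2', '4-1']
```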

 

4. Repetition

Quantifiers let you match repeated occurrences of a character instead of a single instance.

Syntax         Description                                         Example pattern  Example matches   Example non-matches
x*             Matches zero or more times                          ar*o             cacao; carrot     arugula; artichoke
x+             Matches one or more times                           re+              green; tree       trap; ruined
x?             Matches zero or one time                            ro?a             roast; rant       root; rear
x{m}           Matches exactly m times                             \we{2}\w         deer; seer        red; enter
x{m,}          Matches m or more times                             2{3,}4           671-2224; 22224   224; 123
x{m,n}         Matches between m and n times                       12{1,3}3         1234; 1222389     15335; 122223
x*?, x+?, etc  Matches as few times as possible (lazy quantifier)  re+?             tree; free        trout; roasted
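
The quantifiers can be tried out with the short Python sketch below, which reuses the example words from this section. The variable names are illustrative. The last two lines contrast a greedy and a lazy quantifier on the same input.

```python
import re

zero_or_more = re.findall(r"ar*o", "cacao carrot")
one_or_more = re.findall(r"re+", "green tree trap")
optional = re.findall(r"ro?a", "roast rant root")
exactly = re.findall(r"\we{2}\w", "deer seer red")
at_least = re.findall(r"2{3,}4", "671-2224 22224 224")
bounded = re.findall(r"12{1,3}3", "1234 1222389 122223")

# Greedy takes as many characters as it can; lazy takes as few as it can.
greedy = re.search(r"re+", "tree").group()
lazy = re.search(r"re+?", "tree").group()

print(zero_or_more)  # ['ao', 'arro']
print(one_or_more)   # ['ree', 'ree']
print(optional)      # ['roa', 'ra']
print(exactly)       # ['deer', 'seer']
print(at_least)      # ['2224', '22224']
print(bounded)       # ['123', '12223']
print(greedy, lazy)  # ree re
```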

 

5. Capturing, alternation & backreferences

Capture groups let you identify the parts of a string that you want to extract.

Syntax      Description                                         Example pattern               Example matches              Example non-matches
(x)         Captures a pattern                                  (iss)+                        Mississippi; missed          mist; persist
(?:x)       Creates a group without capturing                   (?:ab)(cd)                    abcd (group 1: cd)           acbd
(?<name>x)  Creates a named capture group                       (?<first>\d)(?<second>\d)\d*  1325 (first: 1, second: 3)   2; hello
(x|y)       Matches one of several alternative patterns         (re|ba)                       red; banter                  rant; bear
\n          References capture group n (numbering starts at 1)  (b)(\w*)\1                    blob; bribe                  bear; bring
\k<name>    References a named capture                          (?<first>5)(\d*)\k<first>     51245; 55                    523; 51
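
The same ideas can be sketched in Python. Python's built-in re module spells named groups as (?P<name>...) and named backreferences as (?P=name), not (?<name>...) and \k<name>. The variable names below are illustrative.

```python
import re

# Non-capturing group followed by a capturing group: only "cd" is captured.
numbered = re.search(r"(?:ab)(cd)", "abcd").group(1)

# Named groups, using Python's (?P<name>...) spelling.
named = re.search(r"(?P<first>\d)(?P<second>\d)\d*", "1325").groupdict()

# Alternation: match either "re" or "ba".
alternatives = re.findall(r"(re|ba)", "red banter rant")

# Numbered backreference: \1 must repeat whatever group 1 captured.
numbered_backref = [w for w in ["blob", "bribe", "bear", "bring"]
                    if re.search(r"(b)(\w*)\1", w)]

# Named backreference, using Python's (?P=name) spelling.
named_backref = [s for s in ["51245", "55", "523", "51"]
                 if re.search(r"(?P<first>5)(\d*)(?P=first)", s)]

print(numbered)          # cd
print(named)             # {'first': '1', 'second': '3'}
print(alternatives)      # ['re', 'ba']
print(numbered_backref)  # ['blob', 'bribe']
print(named_backref)     # ['51245', '55']
```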

 

6. Lookahead and lookbehind

Lookarounds let you require or forbid characters before or after a match, without those characters appearing in the match.

Syntax   Description                                                                  Example pattern       Example matches       Example non-matches
(?=x)    Looks ahead at the next characters without including them in the match       an(?=an), iss(?=ipp)  banana; Mississippi   band; missed
(?!x)    Looks ahead and matches only if the next characters are not x                ai(?!n)               fail; brail           faint; train
(?<=x)   Looks behind at the previous characters without including them in the match  (?<=tr)a              trail; translate      bear; streak
(?<!x)   Looks behind and matches only if the previous characters are not x           (?<!tr)a              bear; translate       trail; strained
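
A small Python sketch of the four lookaround forms, run over one made-up word list. The names are illustrative. The last line shows that the lookahead text is not part of the returned match.

```python
import re

words = ["banana", "band", "fail", "faint", "trail", "bear", "translate", "streak"]

followed_by = [w for w in words if re.search(r"an(?=an)", w)]      # positive lookahead
not_followed_by = [w for w in words if re.search(r"ai(?!n)", w)]   # negative lookahead
preceded_by = [w for w in words if re.search(r"(?<=tr)a", w)]      # positive lookbehind
not_preceded_by = [w for w in words if re.search(r"(?<!tr)a", w)]  # negative lookbehind

# The looked-ahead "ipp" is checked but not included in the match.
matched_text = re.search(r"iss(?=ipp)", "Mississippi").group()

print(followed_by)      # ['banana']
print(not_followed_by)  # ['fail', 'trail']
print(preceded_by)      # ['trail', 'translate']
print(not_preceded_by)  # ['banana', 'band', 'fail', 'faint', 'bear', 'translate', 'streak']
print(matched_text)     # iss
```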

 

7. Literal matches and modifiers

Modifiers change the rules the engine uses when it matches a pattern.

Syntax      Description                                                       Example pattern                      Example matches              Example non-matches
\Qx\E       Treats x as literal text (examples assume a whole-string match)   \Qtell\E, \Q\d\E                     tell; \d                     I'll tell you this; I have 5 coins
(?i)x(?-i)  Makes x case-insensitive                                          (?i)te(?-i)                          sTep; tEach                  Trench; bear
(?x)x(?-x)  Ignores whitespace in x                                           (?x)t a p(?-x)                       tap; tapdance                c a t; rot a potato
(?s)x(?-s)  Single-line/DOTALL mode: "." also matches line breaks (\n) in x   (?s)first.and.second(?-s).and.third  first and\nsecond and third  first and\nsecond\nand third
(?m)x(?-m)  Multi-line mode: ^ and $ match at the start and end of each line  (?m)^eat and sleep$                  eat and sleep                treat and sleep; eat and sleep.

In the example strings, \n stands for a line break.
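
In Python the same effects are usually reached with flags instead of the inline on/off switches shown in this section. The sketch below assumes Python 3.6 or later: the built-in re module has no \Q...\E, so re.escape() stands in for it, and scoped groups such as (?i:...) replace (?i)...(?-i). The sample strings and names are illustrative.

```python
import re

# Literal matching: re.escape() plays the role of \Q...\E.
literal = re.search(re.escape(r"\d"), r"use \d for digits").group()

# Case-insensitive matching through a flag or a scoped inline group.
ci_flag = [w for w in ["sTep", "tEach", "Trench"] if re.search(r"te", w, re.IGNORECASE)]
ci_inline = re.search(r"(?i:te)ach", "tEach") is not None

# Verbose mode ignores whitespace in the pattern and allows comments.
verbose = re.search(r"t a p  # spaces here are ignored", "tapdance", re.VERBOSE).group()

# DOTALL lets "." match a line break as well.
two_lines = "first and\nsecond"
without_dotall = re.search(r"and.second", two_lines)
with_dotall = re.search(r"and.second", two_lines, re.DOTALL)

# MULTILINE lets ^ and $ match at the start and end of every line.
lines = "eat and sleep\ntreat and sleep"
multiline_hits = re.findall(r"^eat and sleep$", lines, re.MULTILINE)

print(literal)                  # \d
print(ci_flag)                  # ['sTep', 'tEach']
print(ci_inline)                # True
print(verbose)                  # tap
print(without_dotall)           # None
print(with_dotall is not None)  # True
print(multiline_hits)           # ['eat and sleep']
```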

 

8. Unicode

Regular expressions can also work with characters beyond the Roman alphabet, such as Chinese characters and emojis.

  • Code points - In Unicode, each abstract character is assigned a number, called a code point, which is usually written in hexadecimal.
  • Graphemes - A grapheme is a single character as a reader perceives it, and it is made up of one or more code points.
Syntax  Description                                          Example pattern         Example matches          Example non-matches
\X      Matches graphemes                                    \u0040gmail             @gmail; www.email@gmail  gmail; @aol
\X\X    Matches special characters such as accented letters  \u00e8 or \u0065\u0300  è                        e
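
The difference between code points and graphemes can be sketched in Python. Python's built-in re module does not support \X, so this example uses the standard unicodedata module to normalise text before matching. The variable names and sample strings are illustrative.

```python
import re
import unicodedata

precomposed = "\u00e8"       # è stored as a single code point
decomposed = "\u0065\u0300"  # e followed by a combining grave accent

# The two strings look identical but are different code-point sequences.
same_text = precomposed == decomposed

# A pattern written in one form does not match text stored in the other form.
raw_match = re.search(precomposed, decomposed)

# Normalising to NFC composes the pair into one code point, so the match succeeds.
normalised = unicodedata.normalize("NFC", decomposed)
nfc_match = re.search(precomposed, normalised)

# \u escapes also work for ordinary characters: U+0040 is "@".
at_hits = re.findall("\u0040gmail", "www.email@gmail and gmail")

print(same_text)              # False
print(raw_match)              # None
print(nfc_match is not None)  # True
print(at_hits)                # ['@gmail']
```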

 

Regular expressions are powerful, but they can also be complex and difficult to master. This regex cheat sheet covers some of the most commonly used patterns and syntax, but there are many more possibilities and combinations.

Conclusion

In conclusion, a regex cheatsheet is a quick reference guide that provides a comprehensive list of regular expressions and their corresponding meanings. It is a helpful tool for developers, data analysts, and anyone who works with text data and needs to extract, search or replace specific patterns.

A regex cheat sheet typically includes a variety of regular expression syntax, such as anchors, quantifiers, character classes, groups, and assertions. It also covers special characters and metacharacters that have specific meanings within regular expressions, such as the dot (.), caret (^), dollar sign ($), and backslash (\).

Using a regex cheatsheet can save time and increase productivity by allowing users to quickly and easily identify the appropriate regular expression pattern for a particular task. However, it is important to keep in mind that regular expressions can be complex and may require additional practice and experimentation to master.

If you want to build a career in programming or data science, StarAgile provides a course for Data Science. Data science is a growing field, and completing the training earns you a Data Science Certification while taking your skills to the next level.
