Modes

Recursive

Recursive mode is the default mode of rwalk. It starts from a given path and, each time it finds a directory, scans that directory in turn.

Let's say you want to scan the example.com website with the following structure:

example.com
├── /about
│   ├── /team
│   │   └── /member1
│   └── /contact
└── /products
    ├── /product1
    └── /product2

Recursive mode will start from example.com and check:

  • example.com/about
  • example.com/about/team
  • example.com/about/team/member1
  • example.com/about/contact
  • example.com/products
  • example.com/products/product1
  • example.com/products/product2
  • ...
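The traversal above can be sketched in a few lines of Python. This is only an illustration: the `SITE` dictionary is a hypothetical in-memory stand-in for the example.com structure, and the sketch ignores wordlists, concurrency, and the order in which rwalk actually schedules its requests.

```python
# Hypothetical stand-in for the site: each directory path maps to its children.
SITE = {
    "": ["about", "products"],
    "about": ["team", "contact"],
    "about/team": ["member1"],
    "products": ["product1", "product2"],
}


def walk(base, path=""):
    """Return every URL checked, descending into each directory as it is found."""
    checked = []
    for child in SITE.get(path, []):
        child_path = f"{path}/{child}" if path else child
        checked.append(f"{base}/{child_path}")
        # Paths without an entry in SITE have no children, so recursion stops there.
        checked.extend(walk(base, child_path))
    return checked


visited = walk("example.com")
print("\n".join(visited))
```

The printed URLs follow the same order as the list above, from example.com/about down to example.com/products/product2.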

Directory detection

rwalk will only recurse into directories. If a file is found, it will be ignored. It determines if a path is a directory with a simple algorithm:

if content is html:
    if content contains some of default html directories:
        return True
else if response is redirection and has a location header:
    if location header is equal to f"{url}/":
        return True
    else:
        return False
else if status is in 200-299 or 401-403:
    return True
else:
    return False
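For readers who want to experiment with this logic, here is a runnable Python sketch of the pseudocode above. It is not rwalk's implementation. The function signature, the lowercase header names, and the `HTML_DIRECTORY_MARKERS` strings are assumptions made for illustration, since the pseudocode does not list the "default html directories". It also assumes that an HTML response without such a marker is not a directory, a case the pseudocode leaves without an explicit result.

```python
from typing import Mapping

# Assumption: illustrative stand-ins for the "default html directories" markers.
HTML_DIRECTORY_MARKERS = ("Index of /", "Directory listing for ")


def is_directory(url: str, status: int, headers: Mapping[str, str], body: str = "") -> bool:
    """Mirror the pseudocode; header names are expected in lowercase."""
    if headers.get("content-type", "").startswith("text/html"):
        # Assumption: HTML without a listing marker counts as "not a directory".
        return any(marker in body for marker in HTML_DIRECTORY_MARKERS)
    if 300 <= status <= 399 and "location" in headers:
        # A redirect to the same URL plus a trailing slash suggests a directory.
        return headers["location"] == f"{url}/"
    return 200 <= status <= 299 or 401 <= status <= 403


print(is_directory("http://example.com/about", 301, {"location": "http://example.com/about/"}))
print(is_directory("http://example.com/logo.png", 404, {"content-type": "image/png"}))
```

The first call prints `True` because the redirect only appends a slash. The second prints `False` because a 404 falls outside both status ranges.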

If this algorithm is not enough for your use case, you can implement your own directory detection function in the Rhai scripting language. See Scripting for more information.

Classic

Classic mode allows for template-based fuzzing. You provide a list of patterns to check, and rwalk will replace each placeholder key in those patterns with the words from the corresponding wordlist.

Each wordlist is identified by an optional key, which is used to reference it in the patterns.

💡 The default key is $

For example, let's say you have two wordlists: version.txt and endpoints.txt. You want to find all paths that follow the pattern /api/{version}/{endpoint}. This can be achieved with the following command:

rwalk example.com/api/V/E -w version.txt:V -w endpoints.txt:E

version.txt is associated with the key V, and endpoints.txt is associated with the key E.
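To make the substitution concrete, the following Python sketch expands a pattern against keyed wordlists. The `expand` function and the wordlist contents are hypothetical. It replaces only whole path segments equal to a key, which is a simplification; rwalk's own substitution rules may differ.

```python
from itertools import product


def expand(pattern: str, wordlists: dict[str, list[str]]) -> list[str]:
    """Substitute every key in the pattern with each combination of words."""
    keys = list(wordlists)
    urls = []
    for combo in product(*(wordlists[key] for key in keys)):
        mapping = dict(zip(keys, combo))
        # Simplification: only whole path segments equal to a key are replaced.
        urls.append("/".join(mapping.get(part, part) for part in pattern.split("/")))
    return urls


# Hypothetical contents standing in for version.txt and endpoints.txt.
wordlists = {"V": ["v1", "v2"], "E": ["users", "orders"]}
urls = expand("example.com/api/V/E", wordlists)
for url in urls:
    print(url)
```

With two versions and two endpoints, this yields four candidate URLs, one per combination of words.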

Tip: Instead of keeping a version.txt file, you could generate its contents on the fly and pipe them into rwalk:

seq 1 10 | xargs -I{} echo v{} | rwalk example.com/api/$/E endpoints.txt:E -

Using - as the wordlist path will make rwalk read from the standard input.

These keys can also be used to identify the wordlists in the options. If you want to apply some filtering to only one of the wordlists, you can use the key to reference it.

rwalk example.com/api/V/E -w version.txt:V -w endpoints.txt:E --wf "[E]length:>5"

Here, we are using --wf (short for --wordlist-filter) to keep only the endpoints with more than 5 characters.
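The effect of a keyed filter such as `[E]length:>5` can be sketched in Python as follows. The `apply_length_filter` helper and the wordlist contents are hypothetical. The sketch does not parse rwalk's filter syntax; it only shows the strict comparison being applied to the one wordlist named by the key.

```python
def apply_length_filter(wordlists, key, min_exclusive):
    """Keep only words longer than min_exclusive in the wordlist named by key."""
    return {
        name: [w for w in words if len(w) > min_exclusive] if name == key else list(words)
        for name, words in wordlists.items()
    }


# Hypothetical contents standing in for version.txt and endpoints.txt.
wordlists = {"V": ["v1", "v2"], "E": ["users", "orders", "invoices"]}
filtered = apply_length_filter(wordlists, "E", 5)
print(filtered)
```

Here "users" is dropped because it has exactly 5 characters, while the V wordlist is left untouched.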

Spider

Spider mode, aka crawling mode, starts from a given path and follows all links found until a certain depth. This is particularly useful for recon tasks to find all associated endpoints of a target.

This is also the only mode that needs to be specified explicitly, as rwalk cannot detect it automatically.

For example, to crawl the cstef.dev website with a depth of 4:

rwalk cstef.dev --mode spider --depth 4 --subdomains

The --subdomains flag will also include subdomains in the crawl.

✓ 200 / (dir)
├─ 🔍 cstef.dev
│  ├─ ✓ 200 / (dir)
│  ├─ ✓ 200 /android-chrome-512x512.png (image/png)
│  ├─ ✓ 200 /favicon.ico (image/vnd.microsoft.icon)
│  └─ ✓ 200 /assets (application/javascript)
│     ├─ ✓ 200 /index-d18fbe59.js (application/javascript)
│     └─ ✓ 200 /index-81baf222.css (text/css)
├─ 🔍 blog.cstef.dev
│  ├─ ✓ 200 / (dir)
│  ├─ ✓ 200 /posts (text/html)
│  │  ├─ ✓ 200 /solving-the-traefik-puzzle (text/html)
│  │  └─ ✓ 200 /web-scanning-efficiently (text/html)
│  └─ ✓ 200 /_next/static (application/javascript)
└─ 🔍 ctf.cstef.dev
   └─ ✓ 200 /api/login (text/html)
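The idea of following links up to a fixed depth can be sketched as a breadth-first crawl in Python. Everything here is illustrative: the `LINKS` graph is hypothetical and stands in for real pages, no network requests are made, subdomain handling is left out, and the way depth is counted (the start page is depth 0) is an assumption rather than rwalk's documented behavior.

```python
from collections import deque

# Hypothetical link graph: each page maps to the links found on it.
LINKS = {
    "cstef.dev/": ["cstef.dev/assets", "blog.cstef.dev/"],
    "blog.cstef.dev/": ["blog.cstef.dev/posts"],
    "blog.cstef.dev/posts": ["blog.cstef.dev/posts/web-scanning-efficiently"],
    "blog.cstef.dev/posts/web-scanning-efficiently": ["cstef.dev/"],
}


def crawl(start, max_depth):
    """Visit pages breadth-first, without expanding links past max_depth."""
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # Depth limit reached: record the page but do not follow its links.
        for link in LINKS.get(page, []):
            if link not in seen:  # Skip pages already queued to avoid loops.
                seen.add(link)
                queue.append((link, depth + 1))
    return order


pages = crawl("cstef.dev/", 2)
print(pages)
```

With a depth of 2, the crawl stops at blog.cstef.dev/posts and never reaches the individual post, and the `seen` set keeps the link back to cstef.dev/ from causing a loop at larger depths.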