Web Site Mapper to Create MWG Lists of URLs?

I realize it's not a great idea or the best/easiest way to do this, but here goes anyway... We need to allow access to only one particular website and all of its associated "pieces". I realize a single site is made up of pieces from all over the place, but does anyone know of a good utility (either standalone or online) that can "crawl" a website and provide a list of all of the URLs/hosts/domains that are required to "build" that site?

3 Replies
Level 9

Re: Web Site Mapper to Create MWG Lists of URLs?

It's not built for that, but it should give you some good info: URL Scanner

Level 10

Re: Web Site Mapper to Create MWG Lists of URLs?

When enabling a web site, I will usually dig out the minimum required URLs incrementally, with a combination of rule traces and the browser's developer tools (F12).
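One way to speed up the developer-tools step: the Network tab in most browsers can export a capture as a HAR file (JSON), and the hosts can be pulled out of that. A minimal Python sketch, assuming the standard HAR layout (log.entries[].request.url); the function name and the sample data are made up for illustration:

```python
import json
from urllib.parse import urlsplit


def hosts_from_har(har_text):
    """Return the sorted, de-duplicated hostnames requested in a HAR capture."""
    har = json.loads(har_text)
    hosts = set()
    for entry in har.get("log", {}).get("entries", []):
        # Each entry records one request the browser made while loading the page.
        host = urlsplit(entry["request"]["url"]).hostname
        if host:
            hosts.add(host)
    return sorted(hosts)


# Tiny hand-written sample standing in for a real export (hypothetical hosts).
SAMPLE_HAR = json.dumps({
    "log": {
        "entries": [
            {"request": {"url": "https://www.example.com/"}},
            {"request": {"url": "https://cdn.example.net/app.js"}},
            {"request": {"url": "https://www.example.com/style.css"}},
        ]
    }
})

for host in hosts_from_har(SAMPLE_HAR):
    print(host)
```

With a real capture you would read the exported file and pass its text to hosts_from_har, then review the output before adding anything to a list.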

If I wanted to crawl a site, I would probably just run wget; but last I checked, that won't get you anything linked by JavaScript.
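For the static part of a page, something like the following Python sketch would list the hosts referenced in the HTML. It has the same limitation as a plain crawl: anything loaded by JavaScript at run time is not seen. The class and function names and the sample page are made up for illustration, and it only looks at href, src and action attributes:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class HostCollector(HTMLParser):
    """Collect hostnames from href/src/action attributes in static HTML."""

    URL_ATTRS = {"href", "src", "action"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URL_ATTRS and value:
                # Resolve relative and protocol-relative links against the page URL.
                host = urlsplit(urljoin(self.base_url, value)).hostname
                if host:
                    self.hosts.add(host)


def hosts_in_html(html, base_url):
    """Return the sorted hostnames referenced by one page's markup."""
    collector = HostCollector(base_url)
    collector.feed(html)
    collector.close()
    return sorted(collector.hosts)


# Hypothetical page content; a real run would use HTML fetched from the site.
SAMPLE_PAGE = (
    '<a href="/about">About</a>'
    '<script src="//cdn.example.net/app.js"></script>'
    '<img src="https://img.example.org/logo.png">'
    '<a href="mailto:someone@example.com">Mail</a>'
)

print(hosts_in_html(SAMPLE_PAGE, "https://www.example.com/"))
```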

Level 12

Re: Web Site Mapper to Create MWG Lists of URLs?

One possible methodology would be to try to leverage a Referer value.

For example, something like: if the URL.Host portion of Header.Request.Get("Referer") equals the base website name, then permit the connection.

You'd have to manipulate the referer value to get the base URL.Host value from the referer. You could do that with a regex to pull out the hostname from the complete referer value:

MWG format: regex(^https?://(.*?)/(.*)), with "\1" as the replacement

So then you could do something like String.ReplaceAllMatches(Header.Request.Get("Referer"),regex(^https?://(.*?)/(.*)),"\1") is in list Base Websites.

For example, if you put the base site's hostname in the "Base Websites" list you're comparing against and a request arrives with a Referer value of [...], access could be permitted, because the regex above returns only the hostname portion of the Referer and that matches the list entry.
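To sanity-check the regex outside of MWG, here is a rough Python equivalent of the idea above. It is only a sketch of the matching logic, not MWG rule syntax; the list contents and function names are hypothetical:

```python
import re

# Same pattern as in the rule: capture everything between "://" and the next "/".
REFERER_HOST = re.compile(r"^https?://(.*?)/(.*)")

# Hypothetical stand-in for the "Base Websites" list.
BASE_WEBSITES = {"www.example.com"}


def referer_host(referer):
    """Replace the whole Referer value with its first capture group (the host)."""
    return REFERER_HOST.sub(r"\1", referer)


def is_permitted(referer):
    """True when the host extracted from the Referer is in the base list."""
    return referer_host(referer) in BASE_WEBSITES


print(referer_host("https://www.example.com/products/index.html"))
print(is_permitted("https://www.example.com/products/index.html"))
print(is_permitted("https://other.example.org/page"))
```

Note that the pattern needs a "/" after the hostname. A value without one, or an empty Referer, is returned unchanged and so will not match the list, and a host with an explicit port comes back with the port still attached.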
