heikole Tux's lil' helper
Joined: 04 Oct 2004 Posts: 148 Location: Berlin, Germany
Posted: Sat Sep 10, 2005 3:07 pm Post subject: Finding orphan pages in web site?
Is there an ebuild in Gentoo that can find orphan pages in a web site? If not, does anybody know of an "external" tool?
By orphan pages I mean pages that reside in a web site but are not linked to from any other page.
THX
Heiko
Taladar Guru
Joined: 09 Oct 2004 Posts: 458 Location: Bielefeld, Germany
Posted: Sat Sep 10, 2005 6:26 pm Post subject:
In theory you should be able to mirror the website with wget, then diff the mirror against the original to find files that were not downloaded (and are thus not linked from anywhere). For this to work, disable automatic directory index generation on the server, since a generated index would link to every file in the directory.
I have never tried this myself, but I see no way for wget to download unlinked pages without an automatic directory index.
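
A rough sketch of the comparison step, assuming the site has already been mirrored (for example with `wget --mirror`) and that both the original document root and the mirror are available as local directories. The function names are made up for illustration, and the script only compares file names, not contents:

```python
import sys
from pathlib import Path


def relative_files(root):
    """Return all file paths below root, relative to root, as POSIX strings."""
    root = Path(root)
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def unmirrored_files(original_root, mirror_root):
    """List files that exist in the original tree but not in the mirror.

    If the mirror was produced by following links only, these are the
    candidates for orphan files.
    """
    return sorted(relative_files(original_root) - relative_files(mirror_root))


if __name__ == "__main__":
    # Hypothetical usage: python unmirrored.py /var/www/site ./www.example.org
    if len(sys.argv) == 3:
        for name in unmirrored_files(sys.argv[1], sys.argv[2]):
            print(name)
```

Note that wget may store some pages under different names than the server does (for instance when it saves a directory URL as `index.html`), so the output should be treated as a list of candidates to check by hand.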
heikole Tux's lil' helper
Joined: 04 Oct 2004 Posts: 148 Location: Berlin, Germany
Posted: Sun Sep 11, 2005 9:24 am Post subject:
Thanks for your suggestion, but it's not exactly what I meant. Ideally, I would like to find orphan pages (or files) locally in my source directory as well, not only on my remote site.
CU Heiko _________________ 42
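
For the purely local case, one possible approach is to walk the links starting from an entry page and report every file that was never reached. The following is only a sketch under several assumptions: the entry page is called `index.html`, links appear in `href` or `src` attributes of static HTML, and directory links resolve to `index.html`. The names `LinkCollector`, `local_targets` and `find_orphans` are invented for this example.

```python
import posixpath
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlparse


class LinkCollector(HTMLParser):
    """Collect the values of all href and src attributes in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def local_targets(root, page):
    """Yield root-relative paths of local files that page links to."""
    parser = LinkCollector()
    parser.feed((root / page).read_text(encoding="utf-8", errors="replace"))
    for link in parser.links:
        parts = urlparse(link)
        if parts.scheme or parts.netloc:
            continue  # external link (http:, mailto:, //host/...)
        path = unquote(parts.path)
        if not path:
            continue  # fragment-only or query-only link
        if path.startswith("/"):
            target = posixpath.normpath(path.lstrip("/"))
        else:
            target = posixpath.normpath(posixpath.join(posixpath.dirname(page), path))
        if target == ".." or target.startswith("../"):
            continue  # points outside the site root
        if (root / target).is_dir():
            # Assumption: a directory link is served as its index.html
            target = posixpath.normpath(posixpath.join(target, "index.html"))
        yield target


def find_orphans(root, start="index.html"):
    """Return files below root that are not reachable by links from start."""
    root = Path(root)
    all_files = {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}
    seen = set()
    queue = [start]
    while queue:
        page = queue.pop()
        if page in seen or page not in all_files:
            continue  # already visited, or a broken link
        seen.add(page)
        if page.lower().endswith((".html", ".htm")):
            queue.extend(local_targets(root, page))
    return sorted(all_files - seen)
```

Calling `find_orphans("/path/to/site")` would return the list of unreached files. Links that are built by JavaScript, referenced from CSS via `url(...)`, or produced by server-side scripts are not followed, so files used only in those ways would show up as false positives.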