Author |
Message |
carpman Advocate
Joined: 20 Jun 2002 Posts: 2202 Location: London - UK
|
Posted: Wed Jan 02, 2008 7:38 pm Post subject: courier-imap move duplicate file nightmare [solved] |
|
|
Hello, I've been doing some system maintenance over the holidays and noticed via Thunderbird that one user had a very large inbox, so I created an archive folder and moved most of the emails into it. As it was going to take a while, I went home.
Later, checking the mail server from home, I saw 40% CPU usage and nearly 80GB of disk space gone.
top showed that it was the user I had moved emails for that was using the CPU, so I killed the offending service before all the disk space was gone.
Checking the user's mailbox, I see 10 duplicate mails for every email moved to the new archive folder, plus the user's courier-imap tmp folder has 44GB of mail in it.
Code: |
ls -ls
total 7512
4 -rw-r--r-- 1 flack flack 43 Jan 2 16:29 courierimapacl
0 drwx------ 2 flack flack 72 Jan 2 19:31 courierimapkeywords
1069 -rw-r--r-- 1 flack flack 1090863 Jan 2 19:16 courierimapuiddb
1333 drwx------ 2 flack flack 1364744 Jan 2 19:12 cur
0 -rw------- 1 flack flack 0 Jan 2 16:29 maildirfolder
0 drwx------ 2 flack flack 48 Jan 2 16:29 new
5106 drwx------ 2 flack flack 5228936 Jan 2 19:31 tmp
|
From inside the courier-imap archive folder
Code: |
/root/scripts/dutop
81% 44.9G ./tmp
18% 9.9G ./cur
|
Does anyone know how I can safely remove the duplicate files via the console and complete the move of email from tmp to the archive folder?
many thanks _________________ Work Station - 64bit
Gigabyte GA X48-DQ6 Core2duo E8400
8GB GSkill DDR2-1066
SATA Areca 1210 Raid
BFG OC2 8800 GTS 640mb
--------------------------------
Notebook
Samsung Q45 7100 4gb
Last edited by carpman on Sat Jan 05, 2008 9:40 am; edited 1 time in total |
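Before deleting anything, it can help to see how many files and how much space each Maildir subdirectory holds. The `dutop` script used above is local to the poster's machine, so the following is only a minimal sketch with standard tools; `maildir_usage` is a hypothetical helper name, not part of courier-imap.

```shell
#!/bin/bash
# Hypothetical helper (not from the thread): print the file count and disk
# usage of each standard Maildir subdirectory of the folder given as $1.
maildir_usage() {
    dir="$1"
    for sub in cur new tmp; do
        [ -d "${dir}/${sub}" ] || continue
        # Count regular files without parsing ls output.
        count=$(find "${dir}/${sub}" -maxdepth 1 -type f | wc -l)
        # du -sk reports the total in 1 KiB blocks; keep only the number.
        size=$(du -sk "${dir}/${sub}" | cut -f1)
        printf '%s\t%d files\t%s KiB\n' "${sub}" "${count}" "${size}"
    done
}
```

It could be called as `maildir_usage /home/flack/.maildir/.2007-mail`, using the archive folder path that appears later in the thread. It only reads the directories and changes nothing.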
steveb Advocate
Joined: 18 Sep 2002 Posts: 4564
|
Posted: Wed Jan 02, 2008 8:25 pm Post subject: |
|
|
I use reformail to delete dupe messages from the command line. Should I post you a small script example?
// SteveB |
carpman Advocate
Joined: 20 Jun 2002 Posts: 2202 Location: London - UK
|
Posted: Wed Jan 02, 2008 9:27 pm Post subject: |
|
|
steveb wrote: | I use reformail to delete dupe messages from the command line. Should I post you a small script example?
// SteveB |
If you could, please. The archive folder has about 14,000 emails, most of which are dups; I hate to think what the 44GB tmp folder has in it.
I'm at a loss as to why it created so many dups, as this has never happened before.
cheers |
steveb Advocate
Joined: 18 Sep 2002 Posts: 4564
|
Posted: Wed Jan 02, 2008 9:44 pm Post subject: |
|
|
Just an example, but I think you get the point:
Code: |
#!/bin/bash
# Remove duplicate messages (same Message-ID) from one maildir directory.
if [ -z "${1}" ]
then
    echo "Syntax: ${0} <path to undupe>"
    exit 1
fi
if [ ! -d "${1}" ]
then
    echo "Path '${1}' not found"
    exit 1
fi
UNDUPE_DIR="${1}"
# Temporary Message-ID cache used by reformail -D
DUPS_FILE="/tmp/$$$(date +%s%N 2>/dev/null)"
echo "${UNDUPE_DIR}"
echo " [before] : $(ls -1 --color=no "${UNDUPE_DIR}" | wc -l)"
find "${UNDUPE_DIR}"/ -maxdepth 1 -mindepth 1 -type f | while IFS= read -r foo
do
    # reformail -D exits 0 when the Message-ID is already in the cache (a dupe)
    reformail -D 20000000 "${DUPS_FILE}" <"${foo}" && rm -f "${foo}" >/dev/null 2>&1
done
echo " [after] : $(ls -1 --color=no "${UNDUPE_DIR}" | wc -l)"
[ -f "${DUPS_FILE}" ] && rm "${DUPS_FILE}" |
// SteveB |
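Since `reformail -D` detects duplicates by the Message-ID header, it may be worth doing a dry run first to see which Message-IDs actually repeat. The following is a rough sketch, not part of the script above: `list_dup_message_ids` is a hypothetical name, and it assumes the Message-ID value sits on the same line as the header name (no folded headers).

```shell
#!/bin/bash
# Hypothetical dry-run helper: print each Message-ID that occurs in more
# than one file directly under the directory given as $1, prefixed by the
# number of files carrying it. Nothing is deleted.
list_dup_message_ids() {
    dir="$1"
    # For every file, read only the header block (up to the first empty
    # line) and print the value of the first Message-ID header found.
    find "${dir}" -maxdepth 1 -type f -exec awk '
        /^\r?$/ { exit }
        tolower($0) ~ /^message-id:/ {
            sub(/^[^:]*:[ \t]*/, "")
            sub(/\r$/, "")
            if ($0 != "") print
            exit
        }' {} \; |
    sort | uniq -c | awk '$1 > 1'
}
```

Running `list_dup_message_ids /path/to/folder/cur` prints lines of the form `count <message-id>`. An empty result would suggest the copies do not share Message-IDs, in which case a checksum-based tool is the better fit.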
carpman Advocate
Joined: 20 Jun 2002 Posts: 2202 Location: London - UK
|
Posted: Wed Jan 02, 2008 10:37 pm Post subject: |
|
|
Thanks for the reply.
Umm, I'm not quite getting it, as bash is not my best area.
Undupe is not in portage; I take it I would need to compile undupe?
Which variables would need to change, apart from the dir to check?
What are the benefits of using reformail?
I ask as I have been looking at using fdupes along with the suggestion from this page:
http://ubuntuforums.org/showthread.php?t=647883
steveb Advocate
Joined: 18 Sep 2002 Posts: 4564
|
Posted: Wed Jan 02, 2008 10:42 pm Post subject: |
|
|
Are the Message-IDs all the same for the dupe messages?
Going with a tool that computes the MD5 sum of each file could be a solution as well.
// SteveB |
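The MD5 idea can be sketched with `md5sum`, `sort` and `awk`. This is only an illustration under a few assumptions: `list_identical_files` is a hypothetical name, GNU `md5sum` is available, the copies are byte-for-byte identical, and file names contain no newlines or backslashes (true for normal Maildir names).

```shell
#!/bin/bash
# Hypothetical dry-run helper: among the files directly under the directory
# given as $1, print every file whose content is identical to an earlier
# one, i.e. all but the first file of each checksum group.
list_identical_files() {
    dir="$1"
    find "${dir}" -maxdepth 1 -type f -exec md5sum {} + |
    sort |
    awk '{
        sum = $1
        # md5sum prints a 32-character hash, two separator characters,
        # then the path; the path therefore starts at column 35.
        path = substr($0, 35)
        if (sum == prev) print path
        prev = sum
    }'
}
```

The output is a list of candidate files to remove, one per line; which copy counts as "first" simply follows the sort order. Reviewing the list before feeding it to `rm` keeps this a dry run.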
carpman Advocate
Joined: 20 Jun 2002 Posts: 2202 Location: London - UK
|
Posted: Wed Jan 02, 2008 11:25 pm Post subject: |
|
|
steveb wrote: | Are the message id's all the same for the dupe messages?
Going with a tool doing the MD5 sum for each file could be a solution as well.
// SteveB |
Here is the output from fdupes, which shows one set of duplicate files:
Code: |
/home/flack/.maildir/.2007-mail/cur/1199292138.M605731P6475V000000000000080AI000297C2_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199292007.M64702P6457V000000000000080AI0002921F_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291856.M603918P6447V000000000000080AI00028838_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291656.M920031P6426V000000000000080AI00027629_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291951.M909649P6452V000000000000080AI00028EE0_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291456.M118952P6364V000000000000080AI00025F03_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291722.M145683P6439V000000000000080AI00027CD7_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291588.M156938P6420V000000000000080AI00026F0F_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291792.M692307P6443V000000000000080AI000282BC_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291526.M182041P6416V000000000000080AI000268CF_288.mailserv,S=3522:2,S
|
steveb Advocate
Joined: 18 Sep 2002 Posts: 4564
|
Posted: Thu Jan 03, 2008 6:56 am Post subject: |
|
|
Looks like fdupes does what you need/want?
// SteveB |
carpman Advocate
Joined: 20 Jun 2002 Posts: 2202 Location: London - UK
|
Posted: Sat Jan 05, 2008 9:40 am Post subject: |
|
|
I know it's not the Linux admin way, but I found a Thunderbird extension to remove dupe emails. It works well and quickly; it even cleared out the tmp dir, so 50GB of dupes were cleaned up in a couple of minutes.
https://addons.mozilla.org/en-US/thunderbird/addon/4654
I will look into fdupes for future use, though. |
steveb Advocate
Joined: 18 Sep 2002 Posts: 4564
|
Posted: Sat Jan 05, 2008 10:19 am Post subject: |
|
|
carpman wrote: | ... a thunderbird extension to remove dupe emails, work well and quick, even clear out tmp dir do 50gb of dupes cleaned up in couple of minutes. |
Well... other email clients have that as well; Sylpheed is one of them. However, it is not command line, and you wanted command line.
// SteveB |