Gentoo Forums
courier-imap move duplicate file nightmare [solved]
carpman
Advocate


Joined: 20 Jun 2002
Posts: 2202
Location: London - UK

Posted: Wed Jan 02, 2008 7:38 pm    Post subject: courier-imap move duplicate file nightmare [solved]

Hello. I have been doing some system maintenance over the holidays and noticed via Thunderbird that one user had a very large inbox, so I created an archive folder and moved most of the emails into it. As it was going to take a while, I went home.

Later, checking the mail server from home, I saw 40% CPU usage and nearly 80 GB of disk space gone.

top showed that the CPU was being used by the user whose emails I had moved, so I killed the offending service before all the disk space was gone.

Checking the user's mailbox, I see 10 duplicates of every email moved to the new archive folder, plus the user's courier-imap tmp folder has 44 GB of mail in it.


Code:

ls -ls
total 7512
   4 -rw-r--r-- 1 flack flack      43 Jan  2 16:29 courierimapacl
   0 drwx------ 2 flack flack      72 Jan  2 19:31 courierimapkeywords
1069 -rw-r--r-- 1 flack flack 1090863 Jan  2 19:16 courierimapuiddb
1333 drwx------ 2 flack flack 1364744 Jan  2 19:12 cur
   0 -rw------- 1 flack flack       0 Jan  2 16:29 maildirfolder
   0 drwx------ 2 flack flack      48 Jan  2 16:29 new
5106 drwx------ 2 flack flack 5228936 Jan  2 19:31 tmp


From inside the courier-imap archive folder
Code:

/root/scripts/dutop
81%     44.9G    ./tmp
18%      9.9G    ./cur
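
The dutop script used above is a local helper whose contents are not shown. A rough stand-in using only du, find and wc might look like the sketch below. The function name maildir_usage is made up for illustration, and it assumes the usual cur/new/tmp Maildir layout.

```shell
#!/bin/bash
# Sketch only: summarise size and file count of cur/new/tmp in one Maildir
# folder. maildir_usage is a hypothetical helper, not part of courier-imap.
maildir_usage() {
    local dir="${1:-.}" sub size count
    for sub in cur new tmp; do
        [ -d "${dir}/${sub}" ] || continue
        # du -sk reports the subdirectory size in kilobytes
        size=$(du -sk "${dir}/${sub}" | cut -f1)
        # count regular files only, one level deep
        count=$(find "${dir}/${sub}" -maxdepth 1 -type f | wc -l | tr -d ' ')
        printf '%s\t%s KB\t%s files\n' "${sub}" "${size}" "${count}"
    done
}
```

Run from inside the folder as `maildir_usage .` to get one line per subdirectory.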



Does anyone know how I can safely remove the duplicate files from the console and complete the move of the emails from tmp to the archive folder?

Many thanks
_________________
Work Station - 64bit
Gigabyte GA X48-DQ6 Core2duo E8400
8GB GSkill DDR2-1066
SATA Areca 1210 Raid
BFG OC2 8800 GTS 640mb
--------------------------------
Notebook
Samsung Q45 7100 4gb


Last edited by carpman on Sat Jan 05, 2008 9:40 am; edited 1 time in total

steveb
Advocate


Joined: 18 Sep 2002
Posts: 4564

Posted: Wed Jan 02, 2008 8:25 pm

I use reformail to delete dupe messages from the command line. Should I post you a small script example?

// SteveB

carpman
Advocate


Joined: 20 Jun 2002
Posts: 2202
Location: London - UK

Posted: Wed Jan 02, 2008 9:27 pm

steveb wrote:
I use reformail to delete dupe messages from the command line. Should I post you a small script example?

// SteveB


If you could, please. The archive folder has about 14000 emails, most of which are dupes; I hate to think what the 44 GB tmp folder has in it :(

I am at a loss as to why it created so many dupes, as this has never happened before.

Cheers

steveb
Advocate


Joined: 18 Sep 2002
Posts: 4564

Posted: Wed Jan 02, 2008 9:44 pm

Just an example, but I think you get the point:
Code:
#!/bin/bash
# Remove messages whose Message-ID was already seen in the given directory.
if [ -z "${1}" ]
then
        echo "Syntax: ${0} <path to undupe>"
        exit 1
fi
if [ ! -d "${1}" ]
then
        echo "Path '${1}' not found"
        exit 1
fi

UNDUPE_DIR="${1}"
# Scratch file used by reformail as its Message-ID cache
DUPS_FILE="/tmp/$$$(date +%s%N 2>/dev/null)"

echo "${UNDUPE_DIR}"
echo "  [before] : $(ls -1 --color=no "${UNDUPE_DIR}" | wc -l)"
find "${UNDUPE_DIR}"/ -maxdepth 1 -mindepth 1 -type f | while IFS= read -r foo
do
        # reformail -D exits 0 when the Message-ID is already in the cache,
        # so only repeats of an earlier message are removed
        reformail -D 20000000 "${DUPS_FILE}" <"${foo}" && rm -f "${foo}" >/dev/null 2>&1
done
echo "  [after]  : $(ls -1 --color=no "${UNDUPE_DIR}" | wc -l)"

[ -f "${DUPS_FILE}" ] && rm "${DUPS_FILE}"


// SteveB
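
Since the script above deletes files as it goes, a non-destructive first pass may be worth having. The sketch below only lists the files that repeat an earlier Message-ID and removes nothing. It assumes bash 4 or newer (for associative arrays) and Message-ID headers that are not folded onto a continuation line; the function name list_msgid_dupes is hypothetical.

```shell
#!/bin/bash
# Sketch only: print files whose Message-ID header repeats an earlier file.
# Nothing is deleted. list_msgid_dupes is a made-up name.
list_msgid_dupes() {
    local dir="$1" f id
    local -A seen=()
    # The glob expands in sorted order; Maildir names start with a
    # timestamp, so the oldest copy of each message is the one kept.
    for f in "${dir}"/*; do
        [ -f "${f}" ] || continue
        # Look only at the header block (up to the first empty line) and
        # take the first Message-ID header, matched case-insensitively.
        id=$(awk '/^$/ { exit }
                  tolower($0) ~ /^message-id:/ {
                      sub(/^[^:]*:[ \t]*/, ""); print; exit
                  }' "${f}")
        [ -n "${id}" ] || continue
        if [ -n "${seen[${id}]:-}" ]; then
            printf '%s\n' "${f}"
        else
            seen[${id}]=1
        fi
    done
}
```

Something like `list_msgid_dupes /path/to/folder/cur | wc -l` gives a count to compare against the script's before/after numbers.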

carpman
Advocate


Joined: 20 Jun 2002
Posts: 2202
Location: London - UK

Posted: Wed Jan 02, 2008 10:37 pm

Thanks for the reply.

Umm, I am not quite getting it, as bash is not my best area.

Undupe is not in Portage; I take it I would need to compile undupe?

Which variables would I need to change, apart from the directory to check?

What are the benefits of using reformail?

I ask because I have been looking at using fdupes, along with the suggestion from this page:

http://ubuntuforums.org/showthread.php?t=647883





steveb
Advocate


Joined: 18 Sep 2002
Posts: 4564

Posted: Wed Jan 02, 2008 10:42 pm

Are the message IDs all the same for the dupe messages?
A tool that computes the MD5 sum of each file could be a solution as well.

// SteveB
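
As a rough illustration of the MD5 idea, the sketch below lists every file whose checksum matches an earlier file in the same directory, without deleting anything. It assumes GNU find, sort, xargs and md5sum, and list_md5_dupes is a made-up name. It only catches byte-identical copies, so messages that were altered in transit would need the Message-ID approach instead.

```shell
#!/bin/bash
# Sketch only: print each file whose MD5 sum equals that of an earlier file.
# Nothing is deleted. list_md5_dupes is a hypothetical helper name.
list_md5_dupes() {
    local dir="$1"
    # NUL-separated names keep odd file names intact; sorting first means
    # the oldest Maildir name in each group is treated as the original.
    find "${dir}" -maxdepth 1 -type f -print0 \
        | sort -z \
        | xargs -0 -r md5sum \
        | awk 'seen[$1]++ { sub(/^[0-9a-f]+ [ *]/, ""); print }'
}
```

The output is a plain list of paths, so it can be reviewed by eye before anything is removed.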

carpman
Advocate


Joined: 20 Jun 2002
Posts: 2202
Location: London - UK

Posted: Wed Jan 02, 2008 11:25 pm

steveb wrote:
Are the message IDs all the same for the dupe messages?
A tool that computes the MD5 sum of each file could be a solution as well.

// SteveB


Here is the output from fdupes, which shows one set of duplicate files:
Code:

/home/flack/.maildir/.2007-mail/cur/1199292138.M605731P6475V000000000000080AI000297C2_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199292007.M64702P6457V000000000000080AI0002921F_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291856.M603918P6447V000000000000080AI00028838_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291656.M920031P6426V000000000000080AI00027629_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291951.M909649P6452V000000000000080AI00028EE0_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291456.M118952P6364V000000000000080AI00025F03_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291722.M145683P6439V000000000000080AI00027CD7_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291588.M156938P6420V000000000000080AI00026F0F_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291792.M692307P6443V000000000000080AI000282BC_288.mailserv,S=3522:2,S
/home/flack/.maildir/.2007-mail/cur/1199291526.M182041P6416V000000000000080AI000268CF_288.mailserv,S=3522:2,S
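
The leading number in each Maildir file name is the time, in seconds since the epoch, at which the file was written, so the names themselves hint at how the copies were created. A small sketch (the function name maildir_gaps is hypothetical) that reads such paths on standard input and prints the times in order, with the gap to the previous copy:

```shell
#!/bin/bash
# Sketch only: read Maildir paths on stdin, print each file's timestamp
# (first dot-separated field of the name) and the gap to the previous one.
maildir_gaps() {
    sed 's|.*/||' \
        | cut -d. -f1 \
        | sort -n \
        | awk 'NR > 1 { printf "%s +%ds\n", $1, $1 - prev; prev = $1; next }
               { print $1; prev = $1 }'
}
```

Fed the ten paths above, it reports gaps of between 56 and 131 seconds. That would be consistent with the copy being retried about once a minute, though that is only one possible reading.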


steveb
Advocate


Joined: 18 Sep 2002
Posts: 4564

Posted: Thu Jan 03, 2008 6:56 am

Looks like fdupes does what you need/want?

// SteveB

carpman
Advocate


Joined: 20 Jun 2002
Posts: 2202
Location: London - UK

Posted: Sat Jan 05, 2008 9:40 am

I know it's not the Linux admin way, but I found a Thunderbird extension that removes duplicate emails. It works well and quickly, and it even cleared out the tmp dir, so 50 GB of dupes were cleaned up in a couple of minutes.

https://addons.mozilla.org/en-US/thunderbird/addon/4654


I will look into fdupes for future use, though.

steveb
Advocate


Joined: 18 Sep 2002
Posts: 4564

Posted: Sat Jan 05, 2008 10:19 am

carpman wrote:
... a thunderbird extension to remove dupe emails, work well and quick, even clear out tmp dir do 50gb of dupes cleaned up in couple of minutes.
Well... other email clients have that as well; Sylpheed is one of them. However, it is not command line, and you wanted command line.

// SteveB

All times are GMT