Friday, March 22, 2013
Saturday, March 16, 2013
Friday, March 1, 2013
Rapid file duplicator or copier
Recently, I was given the task of checking the document-processing capacity of an application. The objective of the test was to check whether the system can consume 2 million documents within one hour.
The system is configured to consume documents from a specified folder, but the question is how to create 2 million documents in a folder. Copying the files manually takes a lot of time, and the test needs to be repeated multiple times. The Windows file system doesn't work optimally when a folder contains more than 5000 files, so I needed to create sub-folders, each containing 4000 files, for a total of 2 million files.
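To make the sizing concrete, the figures above can be worked out in a few lines of VBScript. This is only an illustrative calculation using the numbers from the text (2 million files, 4000 files per sub-folder, one hour); the variable names are made up for this example and are not part of the scripts in this post.

```vbscript
'Illustrative sizing calculation (variable names are hypothetical, figures are from the text)
Dim TotalFiles, FilesPerFolder, TestSeconds, FolderCount, FilesPerSecond
TotalFiles = 2000000       'documents to be consumed in one test run
FilesPerFolder = 4000      'files placed in each sub-folder
TestSeconds = 3600         'one hour

FolderCount = TotalFiles \ FilesPerFolder      'integer division
FilesPerSecond = TotalFiles / TestSeconds      'average rate the application must sustain

WScript.Echo "Sub-folders needed: " & FolderCount
WScript.Echo "Average documents per second: " & Round(FilesPerSecond, 1)
```

So the test data needs 500 sub-folders, and the application has to consume roughly 556 documents per second on average to finish within the hour.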
I first wrote a simple VBScript program that uses the CopyFile method to create the duplicate files; it took nearly 20 hours. I then redesigned the program to run the copy in several parallel instances, so that the task can be accomplished in about 1 hour by using 100% of the CPU capacity.
It consists of two programs. The initiator program launches the duplicator program multiple times, so that each instance runs in parallel (strictly speaking as a separate wscript process rather than a thread) and the task is accomplished quickly. The program settings need to be tuned to the system configuration so that the number of parallel instances is optimal: neither too many nor too few.
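The way the work is divided can be sketched as follows: the total index range is cut into equal blocks, and each launched instance receives one block. The helper names `BlockStart` and `BlockEnd` are hypothetical and only illustrate the arithmetic; the sample values (100 copies in blocks of 10) match the sample settings in Initiator.vbs.

```vbscript
'Sketch of how an index range is split into equal blocks, one per launched instance.
'BlockStart and BlockEnd are hypothetical helper names used only for this illustration.
Function BlockStart(LaunchNumber, FilesPerLaunch)
    BlockStart = (LaunchNumber - 1) * FilesPerLaunch + 1
End Function

Function BlockEnd(LaunchNumber, FilesPerLaunch)
    BlockEnd = LaunchNumber * FilesPerLaunch
End Function

Dim TotalCopies, BlockSize, Launches, i
TotalCopies = 100
BlockSize = 10
Launches = TotalCopies \ BlockSize   'assumes TotalCopies is exactly divisible by BlockSize

For i = 1 To Launches
    WScript.Echo "Launch " & i & ": files " & BlockStart(i, BlockSize) & " to " & BlockEnd(i, BlockSize)
Next
```

With these values the first instance gets files 1 to 10, the second 11 to 20, and the tenth 91 to 100.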
Program download link
Initiator.vbs
'Make sure you have enough disk space; this program runs several parallel instances and can use 100% of the CPU
'Just copy the files into any folder; the subfolders are created automatically
'Choose the partition settings below carefully.
FileName = "test.docx" 'Make sure the file exists in the same folder as this script
NumberOfCopies = 100 'Must be exactly divisible by NumberOfPartations (remainder 0) - 1000000 (Actual Test)
NumberOfPartations = 10 'Number of files copied by each launched instance (block size) - 10000 (Actual Test)
TimeStampGenerationAfterCopies = 10 'Write a time stamp to the log file after this many files - 1000 (Actual Test)
NumberOfFilesInFolder = 4 'Number of files inside each folder - 3000 (Actual Test)
'-------------------------------------------------------------------------------------
'Number of digits used to zero-pad the file index, so that the files sort sequentially
Zeros = len(NumberOfCopies)
Const ForAppending = 8
count = NumberOfCopies/NumberOfPartations
Set wshShell = CreateObject( "WScript.Shell" )
set fso=CreateObject("Scripting.FileSystemObject")
WorkingDirectory = fso.GetParentFolderName(Wscript.ScriptFullName)
'Create the output and log folders if they do not exist yet
strFolder = WorkingDirectory & "\Duplicates\"
If Not fso.FolderExists(strFolder) Then
fso.CreateFolder(strFolder)
End If
strFolder = WorkingDirectory & "\log\"
If Not fso.FolderExists(strFolder) Then
fso.CreateFolder(strFolder)
End If
Set MyFile = fso.OpenTextFile(WorkingDirectory & "\log\log.txt", ForAppending, True)
MyFile.WriteLine("###################################################")
MyFile.WriteLine("File Duplication Launch Start:" & funGetTimeStamp())
MyFile.WriteLine("Total Launches:" & count)
Start = 1
End1 = NumberOfPartations
for count1 = 1 to count
'Quote the script path and the file name so that paths containing spaces still work
strFileName = """" & WorkingDirectory & "\FileDuplicator.vbs"" " & Start & " " & End1 & " " & count1 & " " & TimeStampGenerationAfterCopies & " " & NumberOfFilesInFolder & " """ & FileName & """ " & Zeros
wshShell.Run "wscript " & strFileName, 1, False
WScript.Sleep 3000
Start = Start + NumberOfPartations
End1 = End1 + NumberOfPartations
next
MyFile.WriteLine("File Duplication Launch End:" & funGetTimeStamp())
'Close the log file so that the last lines are flushed to disk
MyFile.Close
Set MyFile = Nothing
Set fso = Nothing
Set wshShell = Nothing
'Returns the current time as Mmm_DD_YYYY_HH_MM_SS, e.g. Mar_01_2013_09_05_07
Function funGetTimeStamp()
    Dim sDateTime
    sDateTime = Now()
    'Right("0" & value, 2) left-pads single-digit values with a zero
    funGetTimeStamp = Left(MonthName(Month(sDateTime)), 3) & "_" & _
        Right("0" & Day(sDateTime), 2) & "_" & Year(sDateTime) & "_" & _
        Right("0" & Hour(sDateTime), 2) & "_" & _
        Right("0" & Minute(sDateTime), 2) & "_" & _
        Right("0" & Second(sDateTime), 2)
End Function
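Since too many parallel instances can slow everything down, one possible refinement (not part of the original scripts) is to wait until the number of running wscript.exe processes drops below a limit before starting the next one. The sketch below assumes that WMI is available on the machine and that both scripts run under wscript.exe; `MaxInstances`, `CountWscriptProcesses` and `WaitForFreeSlot` are hypothetical names, and the limit of 8 is an arbitrary placeholder that would need tuning.

```vbscript
'Hedged sketch: limit the number of parallel duplicator instances.
'Assumes the scripts run under wscript.exe and that WMI is available.
Const MaxInstances = 8   'hypothetical limit; tune for your machine

'Counts every wscript.exe process on the machine, including unrelated scripts
Function CountWscriptProcesses()
    Dim wmi, procs
    Set wmi = GetObject("winmgmts:\\.\root\cimv2")
    Set procs = wmi.ExecQuery("SELECT ProcessId FROM Win32_Process WHERE Name = 'wscript.exe'")
    CountWscriptProcesses = procs.Count
End Function

Sub WaitForFreeSlot()
    'The count includes the launcher itself, hence the "+ 1"
    Do While CountWscriptProcesses() >= MaxInstances + 1
        WScript.Sleep 1000
    Loop
End Sub
```

Calling `WaitForFreeSlot` just before `wshShell.Run` in the launch loop would then keep roughly `MaxInstances` duplicators running at any time, instead of relying only on the fixed 3 second pause.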
FileDuplicator.vbs
'This program needs to be called by Initiator.vbs, which passes the necessary command-line parameters
'Arguments
Set objArgs = WScript.Arguments
StartIndex = clng(objArgs(0))
EndIndex = clng(objArgs(1))
LaunchID = clng(objArgs(2))
TimeStampGenerationAfterCopies = clng(objArgs(3))
NumberOfFilesInFolder = clng(objArgs(4))
FileName = objArgs(5)
Zeros = clng(objArgs(6))
Set objArgs = Nothing
FolderIndex = 1
FileCount = 1
TimeStampGenerationAfterCopies1 = TimeStampGenerationAfterCopies
set fso=CreateObject("Scripting.FileSystemObject")
WorkingDirectory = fso.GetParentFolderName(Wscript.ScriptFullName)
strFolder = WorkingDirectory & "\Duplicates\" & LaunchID & "_" & FolderIndex & "\"
If Not fso.FolderExists(strFolder) Then
fso.CreateFolder(strFolder)
End If
'Split the file name into base name and extension (works for any extension, not only .docx)
JustFileName = fso.GetBaseName(FileName)
JUstFileExt = "." & fso.GetExtensionName(FileName)
LogFile = "\log\log_" & LaunchID & ".txt"
OrginalFileNamePath = WorkingDirectory & "\" & FileName
'Declare the constant before it is used by OpenTextFile
Const ForAppending = 8
Set MyFile = fso.OpenTextFile(WorkingDirectory & LogFile, ForAppending, True)
MyFile.WriteLine("##########################################################")
MyFile.WriteLine("LaunchID:" & LaunchID & "---" & "Start:" & funGetTimeStamp())
MyFile.WriteLine("LaunchID:" & LaunchID & "---" & "StartIndex:" & StartIndex)
MyFile.WriteLine("LaunchID:" & LaunchID & "---" & " EndIndex:" & EndIndex)
MyFile.WriteLine("LaunchID:" & LaunchID & "---" & " TimeStamp Generated after number of files:" & TimeStampGenerationAfterCopies)
'Each instance copies exactly the range StartIndex..EndIndex it was given, so that blocks do not overlap
'The first progress time stamp is written once TimeStampGenerationAfterCopies files have been copied
TimeStampCounter = StartIndex + TimeStampGenerationAfterCopies - 1
for count = StartIndex to EndIndex
    'Left-pad the file index with zeros up to Zeros digits
    FileIndexLength = len(count)
    FileIndex = count
    for count1 = 1 to (Zeros - FileIndexLength)
        FileIndex = "0" & FileIndex
    next
    DuplicateFileNamePath = strFolder & JustFileName & "_" & FileIndex & JUstFileExt
    fso.CopyFile OrginalFileNamePath, DuplicateFileNamePath, True
    'Write a progress time stamp to the log at regular intervals
    if TimeStampCounter = count then
        MyFile.WriteLine("LaunchID:" & LaunchID & "---" & "Total Files Duplicated:" & TimeStampGenerationAfterCopies1 & "---" & funGetTimeStamp())
        TimeStampCounter = TimeStampCounter + TimeStampGenerationAfterCopies
        TimeStampGenerationAfterCopies1 = TimeStampGenerationAfterCopies1 + TimeStampGenerationAfterCopies
    end if
    'Start a new sub-folder once the current one holds NumberOfFilesInFolder files
    if FileCount = NumberOfFilesInFolder then
        FileCount = 0
        FolderIndex = FolderIndex + 1
        strFolder = WorkingDirectory & "\Duplicates\" & LaunchID & "_" & FolderIndex & "\"
        If Not fso.FolderExists(strFolder) Then
            fso.CreateFolder(strFolder)
        End If
    End If
    FileCount = FileCount + 1
next
MyFile.WriteLine("LaunchID:" & LaunchID & "---" & " End:" & funGetTimeStamp())
MyFile.Close
Set MyFile = Nothing
'wscript.echo "File Duplication Completed. Total Files:" & NumberOfCopies
Set fso = Nothing
'Returns the current time as Mmm_DD_YYYY_HH_MM_SS, e.g. Mar_01_2013_09_05_07
Function funGetTimeStamp()
    Dim sDateTime
    sDateTime = Now()
    'Right("0" & value, 2) left-pads single-digit values with a zero
    funGetTimeStamp = Left(MonthName(Month(sDateTime)), 3) & "_" & _
        Right("0" & Day(sDateTime), 2) & "_" & Year(sDateTime) & "_" & _
        Right("0" & Hour(sDateTime), 2) & "_" & _
        Right("0" & Minute(sDateTime), 2) & "_" & _
        Right("0" & Second(sDateTime), 2)
End Function
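The zero-padding loop and the folder roll-over in FileDuplicator.vbs can also be expressed as two small functions, which makes the naming scheme easier to check in isolation. This is an alternative sketch, not the code used in the listing; `PadIndex` and `FolderForFile` are hypothetical names.

```vbscript
'Left-pad a number with zeros to a fixed number of digits.
'Values that already have Digits or more digits are returned unchanged.
Function PadIndex(Value, Digits)
    Dim s
    s = CStr(Value)
    If Len(s) < Digits Then s = String(Digits - Len(s), "0") & s
    PadIndex = s
End Function

'1-based folder number for the Nth file copied by one instance (N starts at 1)
Function FolderForFile(N, FilesPerFolder)
    FolderForFile = ((N - 1) \ FilesPerFolder) + 1
End Function

WScript.Echo PadIndex(7, 3)          'prints 007
WScript.Echo FolderForFile(5, 4)     'prints 2
```

With the sample setting of 4 files per folder, files 1 to 4 of an instance land in its first folder and file 5 starts the second one, which is the same roll-over the listing produces.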
Folder structure (Create below folders at any location in your file system)