flat assembler
Message board for the users of flat assembler.

ANSI & ASCII File Cleaner

asmfan



Joined: 11 Aug 2006
Posts: 392
Location: Russian
Strips all unneeded whitespace (spaces, tabs) from the end of each line, leaving printable characters untouched.

USAGE: just drag and drop the file you need (asm, inc, txt, h, etc.) containing plain ANSI or ASCII text/code/data onto either exe (UNICODE or ANSI, both work correctly).
The ANSI version is the best choice for old versions of Windows with poor Unicode support.

EXAMPLE:
- say the input file contains only spaces: after processing you'll get a zero-sized file.
- say the input file contains " -[bla-bla !!] ": after processing you'll get " -[bla-bla !!]" as output.
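The two examples above can be reproduced with a minimal Python sketch. This is not the Cleaner's actual code (the tool itself is a Windows exe whose source is not shown here); the function name `strip_trailing_whitespace` is made up for illustration, and the sketch assumes only spaces and tabs count as removable.

```python
import re

# Spaces/tabs that are followed by a line ending (CRLF or LF) or by end of data.
_TRAILING_WS = re.compile(rb"[ \t]+(?=\r?\n|\Z)")


def strip_trailing_whitespace(data: bytes) -> bytes:
    """Remove spaces and tabs at the end of every line, keeping line endings as-is."""
    return _TRAILING_WS.sub(b"", data)


if __name__ == "__main__":
    print(strip_trailing_whitespace(b"   "))
    print(strip_trailing_whitespace(b" -[bla-bla !!] "))
```

Working on raw bytes avoids guessing the ANSI code page; only the two whitespace bytes and the line-ending bytes matter.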

FASM has such a mechanism, but I noticed that it works imperfectly (not always correctly: FASM sometimes skips some tabs and spaces).

[ADDED]
It's harmless if you occasionally give it a Unicode text file to process: Cleaner checks for Unicode and skips the file if it finds it. The check is done via the byte-order mark.
It checks for UTF-8, UTF-16 and UTF-32 (big and little endian).
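A byte-order-mark check of this kind could look roughly like the Python sketch below. It is an illustration only, not the Cleaner's code; `detect_bom` is a hypothetical name, and the BOM constants come from Python's standard `codecs` module.

```python
import codecs

# UTF-32 marks are tested first: the UTF-32 LE mark (FF FE 00 00)
# begins with the UTF-16 LE mark (FF FE).
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def detect_bom(data: bytes):
    """Return the encoding named by a leading byte-order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

A caller would skip the file whenever `detect_bom` returns something other than `None`. Note that a file without a BOM can still be Unicode, so this check only catches marked files.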

Integration:
- you can add it to the registry as I did. Just use Regedit.
Quote:

[HKEY_CLASSES_ROOT\*\shell\CleanOutFile\command]
@="C:\\WINDOWS\\Temp\\CleanFilesU.exe \"%1\""


The following file types are supported (the content is also checked dynamically to confirm it is text):
Code:
*.h,*.inc,*.asm,*.ini,*.txt,*.hpp,*.c,*.cpp,*.log,*.rc,*.def,*.bat,*.css,*.js,*.xml,*.vbs,*.idl,*.htm,*.html,*.nfo,*.diz

and many more that contain plain text.
So go on and clean them out )

    - Added additional checks for text content. As a result, files in inappropriate formats are not cleaned.

    - Now preserves file times.

    - Support for *nix text files (with LF (10) line endings only); the previous version just skipped them without changes. Speed optimization.

    - Added *.htm, *.html file types (as they're plain text files)

    - Improved command line parser

    - Added support for MS header files (*.h), which contain the 0Ch (form feed) character
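Several of the changes above (the text check, the 0Ch allowance, LF-only files and preserved file times) can be sketched together in Python. This is a guess at the general approach, not the Cleaner's implementation: `looks_like_text` and `clean_file_preserving_times` are hypothetical names, and the rule "any control byte other than tab, LF, form feed or CR means binary" is an assumption.

```python
import os
import re

# Control bytes tolerated in plain text: tab, LF, form feed (0Ch), CR.
_ALLOWED_CONTROLS = frozenset(b"\t\n\x0c\r")


def looks_like_text(data: bytes) -> bool:
    """Heuristic: reject data that contains any other control byte (e.g. NUL)."""
    return all(b >= 0x20 or b in _ALLOWED_CONTROLS for b in data)


def clean_file_preserving_times(path: str) -> bool:
    """Strip trailing spaces/tabs in place; return False if the file was skipped."""
    stat = os.stat(path)  # remember access/modification times before writing
    with open(path, "rb") as f:
        data = f.read()
    if not looks_like_text(data):
        return False
    # Works for both CRLF and LF-only line endings.
    cleaned = re.sub(rb"[ \t]+(?=\r?\n|\Z)", b"", data)
    with open(path, "wb") as f:
        f.write(cleaned)
    # Restore the original timestamps with nanosecond precision.
    os.utime(path, ns=(stat.st_atime_ns, stat.st_mtime_ns))
    return True
```

`os.utime` restores only the access and modification times; the Windows creation time is left alone because rewriting an existing file in place does not change it.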


Description: ANSI version for Win9x with poor unicode support
Download
Filename: CleanFilesA.zip
Filesize: 2.71 KB
Downloaded: 461 Time(s)

Description: UNICODE version for modern Windows OSes.
Download
Filename: CleanFilesU.zip
Filesize: 2.71 KB
Downloaded: 476 Time(s)


_________________
Any offers?


Last edited by asmfan on 11 Mar 2009, 17:06; edited 7 times in total
Post 05 Aug 2007, 19:49
asmfan
A new, more secure version is here.
Post 06 Aug 2007, 13:50
vid
Verbosity in development


Joined: 05 Sep 2003
Posts: 7105
Location: Slovakia
Why don't you try to add Unicode support? That might be a good exercise for you, and would bring this utility closer to real-world usability...

At least UTF-8 support; that shouldn't be much of a problem...
Post 06 Aug 2007, 13:57
asmfan
Actually, I don't know of many Unicode files to process (except for modern *.reg files, which are in UTF-16 little endian). All sources (c, cpp, etc.) use ANSI encodings, as far as I remember. If you kindly point out what this should process, I'll think about how important UTF support is.
Post 06 Aug 2007, 14:09
vid
For example, UTF-8 FASM sources? :)
I doubt any exist; I really meant it just as a kind of exercise...
Post 06 Aug 2007, 15:00
asmfan
No, thanks vid :lol:
Until I find at least one file on my computer to optimize this way, I'd rather read something interesting than program something I cannot even test 8)
Post 06 Aug 2007, 15:46
vid
That is the problem... no one knows Unicode, so no one programs with it, and thus we are still living in a world full of 8-bit characters :)
Post 06 Aug 2007, 16:46
vador



Joined: 12 Nov 2006
Posts: 68
Location: Madagascar
What are really the advantages of Unicode over ASCII and ANSI? Is the speed increase when using Unicode noticeable?
Post 07 Aug 2007, 13:24
vid
No, actually Unicode support causes a considerable slowdown.

The advantage is that everyone in the world uses the same character set, so you don't get a bunch of "?" or boxes when you try to read something written in a language other than yours.
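The "?" effect is easy to reproduce. The Python snippet below is only an illustration with an arbitrarily chosen sample word and code pages (cp1251 for Cyrillic, latin-1 for Western European); nothing in it comes from the Cleaner itself.

```python
text = "Привет"  # Russian for "hello"

# A character set without these letters cannot store them; they degrade to "?".
as_ascii = text.encode("ascii", errors="replace")

# Bytes written under one 8-bit code page and read under another
# decode without error, but into unrelated characters.
cp1251_bytes = text.encode("cp1251")
misread = cp1251_bytes.decode("latin-1")

# With a single shared character set the round trip is lossless.
round_trip = text.encode("utf-8").decode("utf-8")

if __name__ == "__main__":
    print(as_ascii)
    print(misread)
    print(round_trip == text)
```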
Post 07 Aug 2007, 19:25
asmfan
Updated.
Post 08 Aug 2007, 19:37
asmfan
Updated 11 August 2007.
Post 11 Aug 2007, 13:29
asmfan
Not an update, but an addition to the file types: htm and html were added to the list, as they hold only plain text that can be restructured according to Cleaner's algorithm.
If anybody knows additional file types that hold plain text, please post them here so I can add them.
For the sake of a few bytes ;)
Post 16 Nov 2007, 07:23
asmfan
Improved command line parser
Post 10 Dec 2008, 08:50
asmfan
Added support for MS header files (*.h), which contain the 0Ch (form feed) character.
Post 11 Mar 2009, 17:07
Copyright © 1999-2020, Tomasz Grysztar.