A Flash Developer Resource Site

Thread: Embedding large text file

  1. #21
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    Yeah, but you're going to compress the content before putting it into the png anyway, so it's a very reasonable approximation.

  2. #22
    Señor Member Mavrisa's Avatar
    Join Date
    Oct 2005
    Location
    Canada
    Posts
    506
    Suggestion: encode the length of the information into the first pixel as a uint.
    Haikus are easy
    But sometimes they don't make sense
    Refrigerator
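
    The length-in-the-first-pixel suggestion could look roughly like the sketch below. The class name LengthHeader is made up for illustration, and the sketch assumes the payload is shorter than 0xFFFFFF bytes so that the alpha byte can stay at 0xFF.

    ```actionscript
    package {
    	import flash.display.BitmapData;

    	// Hypothetical helper, not part of any utility mentioned in this thread.
    	public class LengthHeader {
    		// Store the payload length in the low 24 bits of pixel (0,0).
    		// Alpha is forced to 0xFF so the colour bytes are stored unchanged.
    		public static function write(bmp:BitmapData, length:uint):void {
    			bmp.setPixel32(0, 0, 0xFF000000 | (length & 0xFFFFFF));
    		}

    		// Read the length back, masking off the alpha byte.
    		public static function read(bmp:BitmapData):uint {
    			return bmp.getPixel32(0, 0) & 0xFFFFFF;
    		}
    	}
    }
    ```

    Keeping the length in 24 bits leaves room for a little under 16 MB of payload, which is far more than the word list needs.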

  3. #23
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    I actually came up with the same idea last night. I've implemented it, but I'm having some trouble with reconstituting things. I'll get it figured out soon.

  4. #24
    Senior Member
    Join Date
    May 2004
    Posts
    226
    If the png doesn't pan out you could try splitting and zipping the list (a.zip, b.zip, c.zip), then embedding the zipped data into individual classes, similar to sstalder's idea:
    class A.zipString = string content of a.zip (with the \ " ' and newline chars escaped)
    class B.zipString = string content of b.zip
    class C.zipString = string content of c.zip
    ...

    Then use something like ASZip to inflate the zipStrings.

    Edit: FZip to inflate, don't think you can inflate with ASZip.
    Last edited by v5000; 04-04-2009 at 06:21 PM.

  5. #25
    Señor Member Mavrisa's Avatar
    Join Date
    Oct 2005
    Location
    Canada
    Posts
    506
    Are there only 27 characters? (A-Z and Linebreak?)

  6. #26
    Ө_ө sleepy mod
    Join Date
    Mar 2003
    Location
    Oregon, USA
    Posts
    2,441
    I'm assuming you want to use these later to check if input is a word - if that's the case you can shave a lot of your filesize down by saving into 26 arrays (A words, B words, etc) and removing the first letter from each - so AARDVARK becomes A[0] = ARDVARK...it's a little goofy and specific to this problem but with 270k words that's 270k fewer letters you have to save and sort through.
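
    A minimal sketch of the bucket-by-first-letter idea, assuming the words are upper-case strings already split into an Array. The class name LetterBuckets and both function names are made up for illustration.

    ```actionscript
    package {
    	// Hypothetical helper: groups words by first letter and drops that letter.
    	public class LetterBuckets {
    		// "AARDVARK" ends up as "ARDVARK" inside buckets["A"].
    		public static function build(words:Array):Object {
    			var buckets:Object = {};
    			for each (var word:String in words) {
    				if (word.length == 0) continue; // skip blank lines
    				var key:String = word.charAt(0);
    				if (buckets[key] == undefined) buckets[key] = [];
    				buckets[key].push(word.substr(1));
    			}
    			return buckets;
    		}

    		// Check a candidate word against the bucket for its first letter.
    		public static function contains(buckets:Object, word:String):Boolean {
    			var bucket:Array = buckets[word.charAt(0)] as Array;
    			return bucket != null && bucket.indexOf(word.substr(1)) != -1;
    		}
    	}
    }
    ```

    The lookup here is a plain linear indexOf within one bucket. A sorted bucket with a binary search would be the obvious next step if that turns out to be too slow.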

  7. #27
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    If we're talking problem specific datastructures now, the list of words could be encoded in a kind of tree, where each node corresponds to a letter, and has a dictionary of children indexed by letter. So "aardvark" and "aardwolf" would share the first four nodes in the path and then diverge.
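
    The tree described above is usually called a trie. A rough sketch, using plain Objects as the per-node dictionaries; the class name WordTrie and the isWord flag are illustrative choices, not something from the thread.

    ```actionscript
    package {
    	// Hypothetical trie: each node has a dictionary of children keyed by letter.
    	public class WordTrie {
    		private var root:Object = { children: {}, isWord: false };

    		// Walk down from the root, creating nodes as needed.
    		public function add(word:String):void {
    			var node:Object = root;
    			for (var i:int = 0; i < word.length; i++) {
    				var letter:String = word.charAt(i);
    				if (node.children[letter] == undefined) {
    					node.children[letter] = { children: {}, isWord: false };
    				}
    				node = node.children[letter];
    			}
    			node.isWord = true; // marks the end of a complete word
    		}

    		// A word is present only if its path exists and ends on a word node.
    		public function contains(word:String):Boolean {
    			var node:Object = root;
    			for (var i:int = 0; i < word.length; i++) {
    				node = node.children[word.charAt(i)];
    				if (node == null) return false;
    			}
    			return node.isWord;
    		}
    	}
    }
    ```

    With this layout "AARDVARK" and "AARDWOLF" share the nodes for A, A, R, D and only branch at the fifth letter.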

  8. #28
    Senior Member
    Join Date
    May 2004
    Posts
    226
    I split the word list into two zips, opened the zips as text, escaped \ " and newline chars then pasted the strings into a single class as two static constants. I tried using just one zip string but flash would crash at compile, which makes me think there might be a maximum length of chars per line, not a maximum number of lines.

    Anyway, I couldn't figure out how to convert the zipStrings to ByteArrays for inflation. The ByteArray class is a total mystery to me, so no surprise there. Also, when opening/saving the zip as text I just chose UTF-8 encoding, as I didn't know what else to choose. I tried using the nochump zip util. Here's what I tried:

    var byteArray1:ByteArray = new ByteArray();
    byteArray1.writeUTFBytes(zip1String);
    var zipFile:ZipFile = new ZipFile(byteArray1);

    and nochump throws:

    Error: invalid zip
    at nochump.util.zip::ZipFile/findEND()
    at nochump.util.zip::ZipFile/readEND()
    at nochump.util.zip::ZipFile/readEntries()
    at nochump.util.zip::ZipFile()

    The swf compiles to a beefy 684 kb (which equals the size of the two zips), so the data is in there but I can't seem to make it usable.
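
    One likely cause of the "invalid zip" error: a zip is binary, and opening arbitrary binary as UTF-8 text is generally lossy, so the bytes written by writeUTFBytes may no longer match the original file. A text-safe encoding avoids that. The sketch below assumes the zip has been converted to a hex string by some outside tool; HexBytes and the zip1Hex name in the usage note are hypothetical.

    ```actionscript
    package {
    	import flash.utils.ByteArray;

    	// Hypothetical helper: turns a hex string such as "504B0304" into raw bytes.
    	public class HexBytes {
    		public static function decode(hex:String):ByteArray {
    			var bytes:ByteArray = new ByteArray();
    			// Two hex digits make one byte.
    			for (var i:int = 0; i + 1 < hex.length; i += 2) {
    				bytes.writeByte(parseInt(hex.substr(i, 2), 16));
    			}
    			bytes.position = 0; // rewind so a reader starts at the first byte
    			return bytes;
    		}
    	}
    }
    ```

    The decoded ByteArray would then be passed to the zip library in place of byteArray1, for example new ZipFile(HexBytes.decode(zip1Hex)). Hex doubles the length of the string in the source file, so this trades size for safety.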

  9. #29
    Senior Member
    Join Date
    Jan 2008
    Location
    UK
    Posts
    269
    Some good suggestions thanks. I hadn't thought about zipping - thanks for looking into it.

    Once the word list is into an array, I do have a reasonably quick method for searching through the list which is the beginning of a tree structure. I'll probably look at optimising it some more later on.

    Mavrisa: Yes, 26 letters plus line break
    Last edited by _Ric_; 04-05-2009 at 03:38 PM.

  10. #30
    Señor Member Mavrisa's Avatar
    Join Date
    Oct 2005
    Location
    Canada
    Posts
    506
    In that case I'm gonna give the png util a shot as well

  11. #31
    Senior Member
    Join Date
    May 2004
    Posts
    226
    OK, back to basics, got something very simple that appears to be working:
    Code:
    package{
    	public class LetterIndex {
    		public static const A:String="word1,word2,word3"; //all a words as csv
    		public static const B:String=""; 
    		public static const C:String="";
    		public static const D:String="";
    		public static const E:String="";
    		public static const F:String="";
    		public static const G:String="";
    		public static const H:String="";
    		public static const I:String="";
    		public static const J:String="";
    		public static const K:String="";
    		public static const L:String="";
    		public static const M:String="";
    		public static const N:String="";
    		public static const O:String="";
    		public static const P:String="";
    		public static const Q:String="";
    		public static const R:String="";
    		public static const S:String="";
    		public static const T:String="";
    		public static const U:String="";
    		public static const V:String="";
    		public static const W:String="";
    		public static const X:String="";
    		public static const Y:String="";
    		public static const Z:String="";
    		
    		public static function getAllWords():Array {
    			var lettersArray:Array = new Array(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z);
    			var allWordsString:String = lettersArray.join(",");
    			
    			return allWordsString.split(",");
    		}
    	}
    }
    I tried this using the "a" words list for all the static const strings, because generating the letter groups as csv takes a lot of time and I'm not up for the grunt work. The resulting LetterIndex.as file is 4096 kb and the swf compiles fine.

    In the fla I get the following results:
    import LetterIndex;
    var allWords:Array = LetterIndex.getAllWords();
    trace( allWords.length ); //408798 -- the length of "a" words * 26
    trace( allWords[ allWords.length-1 ] ); //AZYMS -- the last "a" word
    Last edited by v5000; 04-05-2009 at 05:24 PM.

  12. #32
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    I've got a png util for arbitrary files. I'm doing some more testing, and will be putting it up shortly.

    Mavrisa (and anyone else attempting the same), when you're manipulating bitmaps, be aware that the pixel data is stored in premultiplied format in a BitmapData, meaning that if the alpha component is not 0xFF, then the pixel values you retrieve won't necessarily be the same as the ones you put in. Very annoying.

    Edit: Utility, code, and write-up are up on my blog:
    http://cosmodro.me/blog/2009/apr/4/smuggle-png-utility/
    Last edited by 5TonsOfFlax; 04-05-2009 at 08:19 PM.
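
    The premultiplied-alpha caveat is easy to check for yourself. This is a small timeline sketch, not code from the utility; the pixel values are arbitrary.

    ```actionscript
    import flash.display.BitmapData;

    // 2x1 bitmap with an alpha channel, initially fully transparent.
    var bmp:BitmapData = new BitmapData(2, 1, true, 0x00000000);

    // Opaque pixel: alpha is 0xFF, so the colour bytes should come back unchanged.
    bmp.setPixel32(0, 0, 0xFF123456);
    trace(bmp.getPixel32(0, 0).toString(16));

    // Mostly transparent pixel: the colour bytes are premultiplied internally,
    // so the value read back may differ from the value written.
    bmp.setPixel32(1, 0, 0x10123456);
    trace(bmp.getPixel32(1, 0).toString(16));
    ```

    If the second trace differs from 10123456, that is the rounding loss described above, and it is why data bytes should only go into pixels whose alpha is 0xFF.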

  13. #33
    Señor Member Mavrisa's Avatar
    Join Date
    Oct 2005
    Location
    Canada
    Posts
    506
    Well that is quite amazing. I suppose I should have gone with byteArray. The operation I was trying took quite a while (manually writing 5 bits per character into each pixel :P). I'm just wondering though, is there any way to take out the individual 0's and 1's from a byteArray? You could get almost 2 and a half times smaller filesizes if it were possible..
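
    For reference, packing 5 bits per character into a ByteArray can be done with a small bit buffer. This is a sketch under the assumption that the text contains only A-Z and "\n"; the class name FiveBitCodec and the code assignments (0 as terminator, 1-26 for letters, 27 for line break) are made up for illustration.

    ```actionscript
    package {
    	import flash.utils.ByteArray;

    	// Hypothetical codec. Codes: 0 = end of data, 1..26 = A..Z, 27 = line break.
    	public class FiveBitCodec {
    		public static function pack(text:String):ByteArray {
    			var out:ByteArray = new ByteArray();
    			var buffer:uint = 0;  // bits waiting to be written
    			var bitCount:int = 0; // how many bits are waiting
    			// The extra iteration at i == text.length writes the terminator.
    			for (var i:int = 0; i <= text.length; i++) {
    				var code:uint = 0;
    				if (i < text.length) {
    					var c:Number = text.charCodeAt(i);
    					code = (c == 10) ? 27 : c - 64;
    				}
    				buffer = (buffer << 5) | code;
    				bitCount += 5;
    				while (bitCount >= 8) {
    					bitCount -= 8;
    					out.writeByte((buffer >>> bitCount) & 0xFF);
    				}
    				buffer &= (1 << bitCount) - 1; // keep only the unwritten bits
    			}
    			// Pad the final partial byte with zero bits.
    			if (bitCount > 0) out.writeByte((buffer << (8 - bitCount)) & 0xFF);
    			out.position = 0;
    			return out;
    		}

    		public static function unpack(bytes:ByteArray):String {
    			var chars:Array = [];
    			var buffer:uint = 0;
    			var bitCount:int = 0;
    			bytes.position = 0;
    			while (bytes.bytesAvailable > 0) {
    				buffer = (buffer << 8) | bytes.readUnsignedByte();
    				bitCount += 8;
    				while (bitCount >= 5) {
    					bitCount -= 5;
    					var code:uint = (buffer >>> bitCount) & 0x1F;
    					if (code == 0) return chars.join(""); // terminator reached
    					chars.push(code == 27 ? "\n" : String.fromCharCode(code + 64));
    				}
    				buffer &= (1 << bitCount) - 1;
    			}
    			return chars.join("");
    		}
    	}
    }
    ```

    Going from 8 bits to 5 bits per character shrinks the raw data to 5/8 of its size. Whether that still helps once the bytes are run through compress is a separate question.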

  14. #34
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    I wouldn't worry too much about that. In the case of the text file, I extracted it from the zip and encoded the bare text as a png. Because I used bytearray.compress, the png is 879k, while the original sowpods.txt file is 2.83MB. The zip is 686k, which fits in decently with the expected ratio between fully compressed and the png overhead (including the wasted green channel, which is where any improvement would truly lie).

    I'm not aware of any easy way to read individual bits from a bytearray. But the as3ds stuff does include a BitVector which is conceptually very similar.

  15. #35
    Senior Member
    Join Date
    Jan 2008
    Location
    UK
    Posts
    269
    Wow - this looks awesome!! Your online utility works nicely with the wordlist, turning it into a lovely speckledy green image, and back to the word list file with no problem. I'm just taking a look at your code now, and trying to figure out how to get it working in my own test app. It's not immediately obvious to me, once I've converted the image back to a byte array, how I'm going to turn that into a string (or an array of strings) representing the words. Is it as simple as myByteArray.toString() ?

    Thanks for your help with this - I never expected so much!
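
    One explicit way to get from the decoded ByteArray to an Array of words is readUTFBytes followed by a split. This sketch assumes the original file was plain ASCII or UTF-8 with one word per line; the function name bytesToWords is made up.

    ```actionscript
    import flash.utils.ByteArray;

    // Hypothetical helper: interpret the bytes as text and split on line breaks.
    function bytesToWords(bytes:ByteArray):Array {
    	bytes.position = 0; // read from the start
    	var text:String = bytes.readUTFBytes(bytes.length);
    	// Accept both Windows (\r\n) and Unix (\n) line endings.
    	// If the file ends with a line break, the last element is an empty string.
    	return text.split(/\r?\n/);
    }
    ```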

  16. #36
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    I'm not sure whether .toString will turn a text file into the String in that file. I think there's a pretty good chance it will, actually.

    To be honest, I had assumed that there was a URLLoader.loadBytes similar to Loader.loadBytes which you could simply drop in in place of your current dynamic loading. But I just checked and there isn't. Sometimes flash can be pretty stupid about what it provides and doesn't provide.

    One approach you could take would be to create a swf which loads your file, turns it into a String or Array or whatever, and then writes that Object into a ByteArray (using .writeObject). Convert that ByteArray into a PNG using PNGSmuggler, and then embed that PNG. To get your Object back, just do a .readObject call on the reconstituted ByteArray.

    I may add a TextField to copy and paste into, into the utility for this special case.
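
    The writeObject / readObject round trip mentioned above might look like this. The two-word array is placeholder data, and the step that turns the ByteArray into a PNG and back is left out.

    ```actionscript
    import flash.utils.ByteArray;

    // Build step: serialise the finished Array into a ByteArray.
    var words:Array = ["AARDVARK", "AARDWOLF"]; // placeholder data
    var packed:ByteArray = new ByteArray();
    packed.writeObject(words);

    // ... packed would be converted to a PNG here, embedded, and decoded again ...

    // Runtime step: rewind and read the Array back out.
    packed.position = 0;
    var restored:Array = packed.readObject() as Array;
    trace(restored.length);
    ```

    The appeal of this route is that the game never has to parse text at startup; it gets a ready-made Array straight from the bytes.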

  17. #37
    Senior Member
    Join Date
    Jan 2008
    Location
    UK
    Posts
    269
    Ok I'll try that. I'm getting another problem at the moment though decompressing the byte array. I've created the png using your online utility, and imported into my test project. When I try to decompress it, I get an error 'Error #2058: There was an error decompressing the data.'. I've ensured the image I imported is set at 'lossless png'.

    Maybe I'm doing something wrong - this is my code in its entirety (I'm not importing any classes - I've just copy-pasted the bits of your code I thought were needed for this):

    Code:
    //==== turn the library image into bitmapData ====
    
    var bitmapData:BitmapData;
    var BitmapAsset:Class = getDefinitionByName( "sowpods.png" ) as Class;
    bitmapData = new BitmapAsset( 0, 0 ) as BitmapData;
    
    //=====================================
    
    // decode bitmapData to byteArray
    var bytes:ByteArray = decodePNG(bitmapData);
    
    var myString = bytes.toString();
    trace(myString.length);
    
    
    function decodePNG(bmp:BitmapData):ByteArray {
    	var bytes:ByteArray = bitmapDataToByteArray(bmp);
    	//bytes.compress();
    	bytes.uncompress();
    	return bytes;
    }
    		
    function bitmapDataToByteArray(bd:BitmapData):ByteArray {
    	var bytes:ByteArray = new ByteArray();
    	var bdbytes:ByteArray = bd.getPixels(bd.rect);
    	bdbytes.position = 0;
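    	// The first pixel holds the payload length in its low 24 bits (alpha byte masked off).
    	var length:int = bdbytes.readUnsignedInt() & 0xffffff;
    	for (var i:int = 0; i < length; i ++) {
    	  var throwaway:uint = bdbytes.readUnsignedByte(); // skip the byte that carries no payload
    	  bytes.writeByte(bdbytes.readUnsignedByte()); // keep the payload byte that follows
    	}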
    	return bytes;
    }
    After googling the error, I found someone saying to compress the byte array again before decompressing. It now runs with no error, but the string that's returned is just a short line of gobbledygook - nothing like a word list. Although, the string's length is apparently 676347 bytes, so there certainly seems to be something in there. I'll keep fiddling - perhaps I'm doing something silly.

    <edit> Actually, compressing it again before decompressing just cancels out the decompressing - obviously. What I was getting was just the compressed byte array (which reads as wÚcœ둢;£ÿnÃ]lԵÞÅȞ). So the problem does seem to be decompressing. I suspect something may be happening to the png when it is imported into the library and converted to bitmapData, which is corrupting the byte array, making it uncompressable.
    Last edited by _Ric_; 04-06-2009 at 11:11 AM.

  18. #38
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    The only thing I see that's immediately questionable in there is passing 0, 0 as the width and height to the bitmapData constructor. I don't know whether that would really have any effect.

    Is that usually how you'd instantiate a graphic asset? I would have expected it to be a Bitmap rather than a BitmapData. But then, you should have got a 'cannot coerce' exception when you instantiated it. I think I remember reading something about the difference between casting with the Type() syntax and the as operator which might apply. Try doing
    Code:
    bitmapData = BitmapData(new BitmapAsset(width, height));
    Compressing before uncompressing again would give you the same thing you've got in the first place, which should be a compressed bytearray. Which, of course, would look like gobbledygook. And that's about the right size for the compressed gobbledygook, since the zip file was about the same size.

    I was getting the same "error decompressing the data" before I figured out that thing about the premultiplied pixel values, which indicated that the bytes coming out were not the same as those going in. I'm wondering if maybe Flash is somehow altering the image when embedding it in the library. Wouldn't that be a kick in the pants, making this whole effort useless.

  19. #39
    Will moderate for beer
    Join Date
    Apr 2007
    Location
    Austin, TX
    Posts
    6,801
    Check this thread: http://www.actionscript.org/forums/s....php3?t=175956

    Especially the bit about export for actionscript and instantiating it.

  20. #40
    Senior Member
    Join Date
    May 2004
    Posts
    226
    Out of curiosity, and my own selfish interest in adding a spell checker to my RTE lib (although I think I would go with an external zip that could be set per user's language), I did the grunt work: I csv-ed the word list and split it by first letter. I don't know if there is a line limit to an .as file but I can tell you there definitely is a character limit to a statement. Anyway, if you want it, Ric, here you go:
    http://www.troubledmonkey.com/sowpods.zip

    Also, curious thing, the resultant swf is actually a little smaller than the original sowpods.zip.
    Last edited by v5000; 04-06-2009 at 11:52 AM.
