-
Yeah, but you're going to compress the content before putting it into the png anyway, so it's a very reasonable approximation.
-
Señor Member
Suggestion: encode the length of the information into the first pixel as a uint.
Haikus are easy
But sometimes they don't make sense
Refrigerator
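A minimal sketch of the length-in-first-pixel suggestion, assuming a 32-bit BitmapData and a fully opaque alpha byte, which leaves 24 bits for the length. The helper names storeLength and readLength are made up for illustration.

```actionscript
import flash.display.BitmapData;

// Hypothetical helpers, not part of any library.
// Alpha stays at 0xff, so only the low 24 bits carry the length.
function storeLength(bd:BitmapData, length:uint):void {
    if (length > 0xffffff) {
        throw new RangeError("length does not fit in 24 bits");
    }
    bd.setPixel32(0, 0, 0xff000000 | length);
}

function readLength(bd:BitmapData):uint {
    return bd.getPixel32(0, 0) & 0xffffff;
}

var carrier:BitmapData = new BitmapData(4, 4, true, 0xff000000);
storeLength(carrier, 676347);
trace(readLength(carrier)); // 676347
```

Keeping the alpha byte at ff costs 8 bits of the first pixel, but 24 bits still covers payloads up to roughly 16 MB.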
-
I actually came up with the same idea last night. I've implemented it, but I'm having some trouble with reconstituting things. I'll get it figured out soon.
-
If the png doesn't pan out you could try splitting and zipping the list (a.zip, b.zip, c.zip) then embedding the zipped data into individual classes, similar to sstalder's idea:
class A.zipString = string content of a.zip (with the \ " ' and newline chars escaped)
class B.zipString = string content of b.zip
class C.zipString = string content of c.zip
...
Then use something like ASZip to inflate the zipStrings.
Edit: FZip to inflate, don't think you can inflate with ASZip.
Last edited by v5000; 04-04-2009 at 06:21 PM.
-
Señor Member
Are there only 27 characters? (A-Z and Linebreak?)
-
I'm assuming you want to use these later to check if input is a word - if that's the case you can shave a lot of your filesize down by saving into 26 arrays (A words, B words, etc) and removing the first letter from each - so AARDVARK becomes A[0] = ARDVARK...it's a little goofy and specific to this problem but with 270k words that's 270k fewer letters you have to save and sort through.
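A rough sketch of the first-letter bucket idea, assuming the words arrive as an Array of upper-case Strings. The function names buildBuckets and hasWord are hypothetical.

```actionscript
// Hypothetical sketch: group words by first letter and drop that letter.
function buildBuckets(words:Array):Object {
    var result:Object = {};
    for each (var word:String in words) {
        var key:String = word.charAt(0);
        if (!(key in result)) {
            result[key] = [];
        }
        result[key].push(word.substr(1));
    }
    return result;
}

// Looks up a word by checking only the bucket for its first letter.
function hasWord(buckets:Object, word:String):Boolean {
    var bucket:Array = buckets[word.charAt(0)] as Array;
    return bucket != null && bucket.indexOf(word.substr(1)) != -1;
}

var letterBuckets:Object = buildBuckets(["AARDVARK", "AARDWOLF", "BEE"]);
trace(letterBuckets["A"][0]);           // ARDVARK
trace(hasWord(letterBuckets, "BEE"));   // true
trace(hasWord(letterBuckets, "BEAR"));  // false
```

The indexOf call is a linear scan of one bucket; with sorted buckets a binary search would probably be the next step.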
-
If we're talking problem specific datastructures now, the list of words could be encoded in a kind of tree, where each node corresponds to a letter, and has a dictionary of children indexed by letter. So "aardvark" and "aardwolf" would share the first four nodes in the path and then diverge.
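The tree described here is usually called a trie. A minimal sketch, assuming plain Objects as nodes and an arbitrary "$" key to mark where a complete word ends; addWord and containsWord are illustrative names only.

```actionscript
// Hypothetical trie sketch: each node is a plain Object keyed by letter.
// The "$" key (an arbitrary choice) marks the end of a complete word.
function addWord(trie:Object, word:String):void {
    var node:Object = trie;
    for (var i:int = 0; i < word.length; i++) {
        var letter:String = word.charAt(i);
        if (!(letter in node)) {
            node[letter] = {};
        }
        node = node[letter];
    }
    node["$"] = true;
}

function containsWord(trie:Object, word:String):Boolean {
    var node:Object = trie;
    for (var i:int = 0; i < word.length; i++) {
        node = node[word.charAt(i)];
        if (node == null) {
            return false; // no path for this letter
        }
    }
    return node["$"] === true;
}

var wordTrie:Object = {};
addWord(wordTrie, "aardvark");
addWord(wordTrie, "aardwolf");
trace(containsWord(wordTrie, "aardwolf")); // true
trace(containsWord(wordTrie, "aard"));     // false
```

Lookup cost depends on the length of the word rather than on the size of the list, at the price of building the tree once at startup.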
-
I split the word list into two zips, opened the zips as text, escaped \ " and newline chars then pasted the strings into a single class as two static constants. I tried using just one zip string but flash would crash at compile, which makes me think there might be a maximum length of chars per line, not a maximum number of lines.
Anyway, I couldn't figure out how to convert the zipStrings to byteArrays for inflation. The ByteArray class is a total mystery to me, so no surprise there. Also, when opening/saving the zip as text I just chose utf-8 encoding, as I didn't know what else to choose. I tried using the nochump zip util. Here's what I tried:
var byteArray1:ByteArray = new ByteArray();
byteArray1.writeUTFBytes(zip1String);
var zipFile:ZipFile = new ZipFile(byteArray1);
and nochump throws:
Error: invalid zip
at nochump.util.zip::ZipFile/findEND()
at nochump.util.zip::ZipFile/readEND()
at nochump.util.zip::ZipFile/readEntries()
at nochump.util.zip::ZipFile()
The swf compiles to a beefy 684 kb (which equals the size of the two zips), so the data is in there but I can't seem to make it usable.
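One likely cause of the "invalid zip" error, offered as a guess rather than a diagnosis: arbitrary binary data generally does not survive being opened as UTF-8 text and written back with writeUTFBytes, so the bytes reaching ZipFile differ from the original file. A sketch of one workaround, assuming the zip has been converted to a hex string by some external tool beforehand; hexToBytes is a hypothetical helper.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: turns a hex string such as "504b0304" into raw bytes.
function hexToBytes(hex:String):ByteArray {
    var bytes:ByteArray = new ByteArray();
    for (var i:int = 0; i + 1 < hex.length; i += 2) {
        bytes.writeByte(parseInt(hex.substr(i, 2), 16));
    }
    bytes.position = 0;
    return bytes;
}

// "504b0304" is the signature that starts a zip local file header.
var zipBytes:ByteArray = hexToBytes("504b0304");
trace(zipBytes.length);          // 4
trace(zipBytes[0].toString(16)); // 50
```

The resulting ByteArray could then be handed to new ZipFile(...) as in the attempt above. Hex doubles the size of the string constant, so this trades source size for a byte-exact round trip.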
-
Some good suggestions thanks. I hadn't thought about zipping - thanks for looking into it.
Once the word list is into an array, I do have a reasonably quick method for searching through the list which is the beginning of a tree structure. I'll probably look at optimising it some more later on.
Mavrisa: Yes, 26 letters plus line break
Last edited by _Ric_; 04-05-2009 at 03:38 PM.
-
Señor Member
In that case I'm gonna give the png util a shot as well
-
OK, back to basics, got something very simple that appears to be working:
Code:
package {
    public class LetterIndex {
        public static const A:String="word1,word2,word3"; //all a words as csv
        public static const B:String="";
        public static const C:String="";
        public static const D:String="";
        public static const E:String="";
        public static const F:String="";
        public static const G:String="";
        public static const H:String="";
        public static const I:String="";
        public static const J:String="";
        public static const K:String="";
        public static const L:String="";
        public static const M:String="";
        public static const N:String="";
        public static const O:String="";
        public static const P:String="";
        public static const Q:String="";
        public static const R:String="";
        public static const S:String="";
        public static const T:String="";
        public static const U:String="";
        public static const V:String="";
        public static const W:String="";
        public static const X:String="";
        public static const Y:String="";
        public static const Z:String="";
        public static function getAllWords():Array {
            var lettersArray:Array = new Array(A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z);
            var allWordsString:String = lettersArray.join(",");
            return allWordsString.split(",");
        }
    }
}
I tried this with using the "a" words list for all the static const strings because generating the letter groups as csv takes a lot of time and I'm not up for the grunt work. The resulting LetterIndex.as file is 4096 kb and the swf compiles fine.
In the fla I get the following results:
import LetterIndex;
var allWords:Array = LetterIndex.getAllWords();
trace( allWords.length ); //408798 -- the length of "a" words * 26
trace( allWords[ allWords.length-1 ] ); //AZYMS -- the last "a" word
Last edited by v5000; 04-05-2009 at 05:24 PM.
-
I've got a png util for arbitrary files. I'm doing some more testing, and will be putting it up shortly.
Mavrisa (and anyone attempting the same), when you're manipulating bitmaps, be aware that the pixel data is stored in premultiplied format in a BitmapData, meaning that if the alpha component is not ff, then the pixel values you retrieve won't necessarily be the same as the ones you put in. Very annoying.
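A small experiment that shows the premultiplication effect, assuming a transparent 32-bit BitmapData. The pixel values are arbitrary test values.

```actionscript
import flash.display.BitmapData;

var probe:BitmapData = new BitmapData(1, 1, true, 0);

// Semi-transparent pixel: the color channels may come back rounded,
// because they are stored multiplied by the alpha value.
probe.setPixel32(0, 0, 0x10123456);
trace(probe.getPixel32(0, 0).toString(16));

// Fully opaque pixel: the color channels come back unchanged.
probe.setPixel32(0, 0, 0xff123456);
trace(probe.getPixel32(0, 0).toString(16)); // ff123456
```

The practical consequence is that the alpha byte cannot safely carry data; keeping it at ff and storing data only in the color channels avoids the problem.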
Edit: Utility, code, and write-up are up on my blog:
http://cosmodro.me/blog/2009/apr/4/smuggle-png-utility/
Last edited by 5TonsOfFlax; 04-05-2009 at 08:19 PM.
-
Señor Member
Well that is quite amazing. I suppose I should have gone with byteArray. The operation I was trying took quite a while (manually writing 5 bits per character into each pixel :P). I'm just wondering though, is there any way to take out the individual 0's and 1's from a byteArray? You could get almost 2 and a half times smaller filesizes if it were possible..
-
I wouldn't worry too much about that. In the case of the text file, I extracted it from the zip and encoded the bare text as a png. Because I used bytearray.compress, the png is 879k, while the original sowpods.txt file is 2.83MB. The zip is 686k, which fits in decently with the expected ratio between fully compressed and the png overhead (including the wasted green channel, which is where any improvement would truly lie).
I'm not aware of any easy way to read individual bits from a bytearray. But the as3ds stuff does include a BitVector which is conceptually very similar.
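Individual bits can still be picked out by hand with a shift and a mask. A sketch, assuming bits are numbered from the most significant bit of the first byte; readBit is a hypothetical helper, not a ByteArray method.

```actionscript
import flash.utils.ByteArray;

// Hypothetical helper: returns bit number bitIndex as 0 or 1,
// where bit 0 is the most significant bit of the first byte.
function readBit(bytes:ByteArray, bitIndex:uint):uint {
    var byteValue:uint = bytes[bitIndex >>> 3];      // which byte
    return (byteValue >>> (7 - (bitIndex & 7))) & 1; // which bit in it
}

var packed:ByteArray = new ByteArray();
packed.writeByte(0xA0); // binary 10100000
trace(readBit(packed, 0), readBit(packed, 1), readBit(packed, 2)); // 1 0 1
```

Reading a 5-bit character code would mean calling this five times and combining the results, which is likely to be slow for a list this size.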
-
Wow - this looks awesome!! Your online utility works nicely with the wordlist, turning it into a lovely speckledy green image, and back to the word list file with no problem. I'm just taking a look at your code now, and trying to figure out how to get it working in my own test app. It's not immediately obvious to me, once I've converted the image back to a byte array, how I'm going to turn that into a string (or an array of strings) representing the words. Is it as simple as myByteArray.toString() ?
Thanks for your help with this - I never expected so much!
-
I'm not sure whether .toString will turn a text file into the String in that file. I think there's a pretty good chance it will, actually.
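An alternative to .toString that makes the decoding explicit is readUTFBytes. A sketch, assuming the ByteArray holds UTF-8 text with one word per line; the three words are stand-in data.

```actionscript
import flash.utils.ByteArray;

// Stand-in for the decoded word list; real data would come from the PNG.
var raw:ByteArray = new ByteArray();
raw.writeUTFBytes("AA\nAAH\nAAHED");

// Rewind, read everything as one UTF-8 String, then split into words.
raw.position = 0;
var listText:String = raw.readUTFBytes(raw.bytesAvailable);
var wordList:Array = listText.split(/\r?\n/);
trace(wordList.length); // 3
```

The regular expression in split accepts both Unix and Windows line endings, since the original file's line ending is not known here.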
To be honest, I had assumed that there was a URLLoader.loadBytes similar to Loader.loadBytes which you could simply drop in in place of your current dynamic loading. But I just checked and there isn't. Sometimes flash can be pretty stupid about what it provides and doesn't provide.
One approach you could take would be to create a swf which loads your file, turns it into a String or Array or whatever, and then writes that Object into a ByteArray (using .writeObject). Convert that ByteArray into a PNG using PNGSmuggler, and then embed that PNG. To get your Object back, just do a .readObject call on the reconstituted ByteArray.
I may add a TextField to copy and paste into, into the utility for this special case.
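The writeObject / readObject round trip described above, sketched with a three-word stand-in Array. The PNG step is left out, so this only shows the serialization half.

```actionscript
import flash.utils.ByteArray;

var original:Array = ["AA", "AAH", "AAHED"];

// Build step: serialize the Array (AMF format) and compress it.
var packedWords:ByteArray = new ByteArray();
packedWords.writeObject(original);
packedWords.compress();

// Runtime step: undo both operations in reverse order.
packedWords.uncompress();
packedWords.position = 0;
var restored:Array = packedWords.readObject() as Array;
trace(restored.length); // 3
trace(restored[2]);     // AAHED
```

This skips the String parsing at runtime, since readObject hands back the Array directly.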
-
Ok I'll try that. I'm getting another problem at the moment though decompressing the byte array. I've created the png using your online utility, and imported into my test project. When I try to decompress it, I get an error 'Error #2058: There was an error decompressing the data.'. I've ensured the image I imported is set at 'lossless png'.
Maybe I'm doing something wrong - this is my code in its entirety (I'm not importing any classes - I've just copy-pasted the bits of your code I thought were needed for this):
Code:
//==== turn the library image into bitmapData ====
var bitmapData:BitmapData;
var BitmapAsset:Class = getDefinitionByName( "sowpods.png" ) as Class;
bitmapData = new BitmapAsset( 0, 0 ) as BitmapData;
//=====================================
// decode bitmapData to byteArray
var bytes:ByteArray = decodePNG(bitmapData);
var myString = bytes.toString();
trace(myString.length);
function decodePNG(bmp:BitmapData):ByteArray {
    var bytes:ByteArray = bitmapDataToByteArray(bmp);
    //bytes.compress();
    bytes.uncompress();
    return bytes;
}
function bitmapDataToByteArray(bd:BitmapData):ByteArray {
    var bytes:ByteArray = new ByteArray();
    var bdbytes:ByteArray = bd.getPixels(bd.rect);
    bdbytes.position = 0;
    var length:int = bdbytes.readUnsignedInt() & 0xffffff;
    for (var i:int = 0; i < length; i ++) {
        var throwaway:uint = bdbytes.readUnsignedByte(); //throw away
        bytes.writeByte(bdbytes.readUnsignedByte());
    }
    return bytes;
}
After googling the error, I found someone saying to compress the byte array again before decompressing. It now runs with no error, but the string that's returned is just a short line of gobbledygook - nothing like a word list. Although, the string's length is apparently 676347 bytes, so there certainly seems to be something in there. I'll keep fiddling - perhaps I'm doing something silly.
<edit> Actually, compressing it again before decompressing just cancels out the decompressing - obviously. What I was getting was just the compressed byte array (which reads as wÚcœ둢;£ÿnÃ]lԵÞÅȞ). So the problem does seem to be decompressing. I suspect something may be happening to the png when it is imported into the library and converted to bitmapData, which is corrupting the byte array, making it uncompressable.
Last edited by _Ric_; 04-06-2009 at 11:11 AM.
-
The only thing I see that's immediately questionable in there is passing 0, 0 as the width and height to the bitmapData constructor. I don't know whether that would really have any effect.
Is that usually how you'd instantiate a graphic asset? I would have expected it to be a Bitmap rather than a BitmapData. But then, you should have got a 'cannot coerce' exception when you instantiated it. I think I remember reading something about the difference between casting with the Type() syntax and the as operator which might apply. Try doing
Code:
bitmapData = BitmapData(new BitmapAsset(width, height));
Compressing before uncompressing again would give you the same thing you've got in the first place, which should be a compressed bytearray. Which, of course, would look like gobbledygook. And that's about the right size for the compressed gobbledygook, since the zip file was about the same size.
I was getting the same "error decompressing the data" before I figured out that thing about the premultiplied pixel values, which indicated that the bytes coming out were not the same as those going in. I'm wondering if maybe Flash is somehow altering the image when embedding it in the library. Wouldn't that be a kick in the pants, making this whole effort useless.
-
Check this thread: http://www.actionscript.org/forums/s....php3?t=175956
Especially the bit about export for actionscript and instantiating it.
-
Out of curiosity, and my own selfish interest in adding a spell checker to my RTE lib (although I think I would go with an external zip that could be set per user's language), I did the grunt work: I csv-ed the word list and split it by first letter. I don't know if there is a line limit to an .as file but I can tell you there definitely is a character limit to a statement. Anyway, if you want it Ric, here you go:
http://www.troubledmonkey.com/sowpods.zip
Also, curious thing, the resultant swf is actually a little smaller than the original sowpods.zip.
Last edited by v5000; 04-06-2009 at 11:52 AM.