
Title: lu_block_jobs
Post by: Buxton on January 20, 2019, 11:29:12 pm
Hey AK,

Did you ever come across the command in the title when you wrote openLuup?  http://wiki.micasaverde.com/index.php/Luup_Plugins_ByHand#.3Crun.2Fjob.2Fincoming.2Ftimeout.3E

I'm asking because I've been trying to solve a dropped-connection problem that I keep having with a plugin I have installed, and it occurred to me that it may be the result of a race condition of some sort.  Basically, anytime the connection (a TCP/IP socket) drops, the plugin tries to re-open the connection with an io.open, to no avail.  Here's the code:
Code: [Select]
luup.io.open(lul_device, PLUGIN_SETTINGS.PANEL_IP_ADDRESS, PLUGIN_SETTINGS.PANEL_IP_PORT)
luup.io.intercept(lul_device)
local wStatus = luup.io.write("^01,00$\r\n", lul_device)
local pData = luup.io.read(2, lul_device)

A luup reload will often, but not always, clear the problem.  The drop happens most often after long periods of connectivity, or if another plugin is behaving badly (per error messages in the log).  I can also clear the problem by logging into the external device via PuTTY.  Once cleared, the device will once again connect with openLuup on reload.

Anyway, just looking for advice, not any sort of action or investigation, as part of what's going on here is probably related to the way the external device handles repeated connections...
Title: Re: lu_block_jobs
Post by: akbooer on January 21, 2019, 06:01:27 am
That link you gave was the most read Vera Wiki page ever when I was developing openLuup in the first place!

This is, undeniably, the worst-implemented, worst-documented, least understood, and most error-prone part of the Vera/MiOS Luup architecture.

Countless experiments by myself and others have failed to converge on the 'correct' implementation for openLuup.  Here are a few:


It's worth, perhaps, reading at least a bit of those threads to understand some of the issues.

One basic problem is the lack of a luup.io.close() function; another is the detailed understanding of what luup.io.intercept() actually does; yet another is the differing requirements of devices once the connection is lost and needs to be re-established.

The solution generally adopted by plugin developers has been to abandon any attempt to use the luup.io module in favour of returning to the LuaSocket library and the functions available there.  You still have to overcome the semantic issue of what the particular device protocol requires.
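
For illustration, a minimal LuaSocket sketch of that approach might look like the following (names and timeout values are purely illustrative); the point is that owning the socket yourself gives you the explicit close and re-open that luup.io lacks:
Code: [Select]
local socket = require "socket"

local sock      -- module-level socket, owned by the plugin

local function connect (ip, port)
    if sock then sock: close () end         -- the explicit close that luup.io is missing
    sock = socket.tcp ()
    sock: settimeout (5)                    -- connect timeout, in seconds
    local ok, err = sock: connect (ip, port)
    if ok then sock: settimeout (0) end     -- non-blocking for subsequent polled reads
    return ok, err
end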

Have you actually tried this on Vera and compared it to openLuup?  I'm really interested in any observed discrepancies.  My expectation is that repeated calls to luup.io.open() will invariably fail.  The fact that a reload does not always clear the problem, but external intervention with the device does, suggests that this is perhaps not entirely an openLuup issue?

Not an easy one.

______________

Edit: I should also note that I abandoned (early on) trying to emulate Vera exactly in the I/O implementation, witness the User Guide caveat that's been there from day one:

Quote
Doesn't support the <incoming> or <timeout> action tags in service files, 
but does support the device-level <incoming> tag.
Title: Re: lu_block_jobs
Post by: Buxton on January 22, 2019, 11:47:24 pm
OK, read through those very informative posts, and will probably read through again, as there were some insights into the comms protocol that might be at work here.  In the meantime, what do you make of the following?  These are sequential log entries that show the socket not being released--I'm assuming this because the same socket id is used.  openLuup does have an explicit close within the open function (I bracketed the close with log calls to get a sense of what's happening, as well as creating a return variable for the close):
Code: [Select]
_log("Now closing the socket for ".. tostring(device).. " ".. tostring(sock))
        local tempClose  = sock: close ()            -- close our end
_log("should now be closed for ".. tostring(device).. " ".. tostring(ip) .." ".. tostring(port) .." " .. tostring(tempClose))
        scheduler.socket_unwatch (sock)
        _log ("socket connection closed " .. tostring (sock))

Here are the relevant log entries:
Code: [Select]
2019-01-22 20:18:53.373   luup.io.open:: connecting to 10.17.2.30:4025, using crlf protocol tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:18:53.373   openLuup.io:: connect OK

2019-01-22 20:19:03.143   luup.io.incoming:: bytes received: 6, status: OK tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:19:03.143   luup_log:7: (EVL3VistaAlarmPanel::processLogin) EVL3 Interface requesting Login.
2019-01-22 20:19:03.143   luup.variable_set:: 7.urn:micasaverde-com:serviceId:EVL3VistaAlarmPanel1.Plugin_Status was: Startup completed. now: Start Login #hooks:0
2019-01-22 20:19:03.143   luup.io.write:: message length: 6, bytes sent: 6, status: OK tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:19:03.144   luup.io.incoming:: bytes received: 10, status: OK tcp{client}: 0xaaab0c36f2a8

2019-01-22 20:19:03.151   openLuup.io:: Now closing the socket for 7 tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:19:03.151   openLuup.io:: should now be closed for 7 10.17.2.30 4025 1
2019-01-22 20:19:03.152   openLuup.io:: socket connection closed tcp{client}: 0xaaab0c36f2a8

and then, minutes later, after the explicit close, there is an attempt to open the non-existent socket:

2019-01-22 20:22:04.556   luup.variable_set:: 7.urn:micasaverde-com:serviceId:EVL3VistaAlarmPanel1.ConnectionDown was: 0 now: 1 #hooks:1
2019-01-22 20:22:09.562   luup.io.write:: error: timeout tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:22:11.565   luup.io.read:: error: timeout tcp{client}: 0xaaab0c36f2a8
2019-01-22 20:22:11.566   luup_log:7: EVL Data for io read while down nil iostatus false lul_device 7 lug_device 7

The attempt to open the non-existent socket is part of the code I posted in my opening question.

Is the release and re-acquisition of a socket supposed to be handled in the device code?  If so, how would one do so using luup.io?  I'm asking in the sense that I cannot envision doing anything differently if I were to create my own socket connection.  I would not proceed differently, as far as flags, timeouts, etc., from what is now being done internally by openLuup.

As always, your insights greatly appreciated.
Title: Re: lu_block_jobs
Post by: Buxton on January 23, 2019, 12:03:16 am
And here's an example where the socket close occurs directly after a luup reload and what seems to be a successful data read:

Code: [Select]
2019-01-22 20:57:32.122   luup.io.incoming:: bytes received: 6, status: OK tcp{client}: 0xaaab0ce3aa18
2019-01-22 20:57:32.122   luup_log:7: (EVL3VistaAlarmPanel::processLogin) EVL3 Interface requesting Login.
2019-01-22 20:57:32.122   luup.variable_set:: 7.urn:micasaverde-com:serviceId:EVL3VistaAlarmPanel1.Plugin_Status was: Startup completed. now: Start Login #hooks:0
2019-01-22 20:57:32.123   luup.io.write:: message length: 6, bytes sent: 6, status: OK tcp{client}: 0xaaab0ce3aa18
2019-01-22 20:57:32.124   luup.io.incoming:: bytes received: 10, status: OK tcp{client}: 0xaaab0ce3aa18
2019-01-22 20:57:32.124   openLuup.http:: GET /data_request?id=lu_status2&output_format=json&DataVersion=219291509&Timeout=60&MinimumDelay=1500&_=1548216570158 HTTP/1.1 tcp{client}: 0xaaab0dad8d28
2019-01-22 20:57:32.125   openLuup.http:: GET /console?page=log HTTP/1.1 tcp{client}: 0xaaab0ddb9858
2019-01-22 20:57:32.125   openLuup.wsapi:: using openLuup/console.lua for console
2019-01-22 20:57:32.125   openLuup.wsapi:: using REQUIRE to load .lua CGI
2019-01-22 20:57:32.143   openLuup.http:: request completed (57475 bytes, 0 chunks, 18 ms) tcp{client}: 0xaaab0ddb9858
2019-01-22 20:57:32.144   openLuup.io:: Now closing the socket for 7 tcp{client}: 0xaaab0ce3aa18
2019-01-22 20:57:32.144   openLuup.io:: should now be closed for 7 10.17.2.30 4025 1
2019-01-22 20:57:32.144   openLuup.io:: socket connection closed tcp{client}: 0xaaab0ce3aa18
Title: Re: lu_block_jobs
Post by: akbooer on January 23, 2019, 10:39:46 am
Quote
The attempt to open the non-existent socket is part of the code I posted in my opening question.

Yes.  As I said...

Quote
My expectation is that repeated calls to luup.io.open() will invariably fail.

Regarding this...

Quote
And here's an example where the socket close occurs directly after a luup reload and what seems to be a successful data read:

...it would be the expected behaviour if the remote device closed the socket at its end after sending.  Given that this whole transaction happens in milliseconds, I'm pretty sure that's what it's doing.

Let me ask again...

Quote
Have you actually tried this on Vera and compared it to openLuup?

...because I'm imagining that you will find the same behaviour there.

Perhaps what you are asking for is a DIFFERENT behaviour from Vera, wherein you can re-open a socket?
Title: Re: lu_block_jobs
Post by: Buxton on January 23, 2019, 03:12:55 pm
Yes, sorry I didn't answer about a Vera install.  Ironically, this very program that is giving me grief is the plugin that made me buy a VeraLite in the first place, as I wanted integration with my alarm system, and at the time the EVL plugin was the only thing available.  With further irony, it is this plugin that led me to openLuup, as when the plugin was loaded on my Vera, the machine would do a luup restart several times a day, not to mention some oddities with Vera itself about reporting sensor states.  So it is impossible to tell what kind of behavior was occurring, because the entire machine was unstable.  Needless to say, it worked out for the best, as openLuup has been a lot of fun and has led me to all kinds of interesting integrations!  That being said, I believe there are others using this plugin on Vera (it may work better on the faster machines with more memory).

To the problem at hand--one thing I'm not clear on is whether the same socket ID (in this case 0xaaab0ddb9858) is assigned to a socket within a given instance of openLuup regardless of whether it has been previously closed.  I'm guessing that the ID is a handle to the socket, and not a hard identifier of an IP address + port combination, as the ID changes with a luup reload.  So part of the problem may be that a new socket handle needs to be created in the event of a socket close (closed internally by openLuup) that is then followed by an open attempt.

Per the millisecond timeout, this may also be the result of not supplying a valid login on a retry of the open.  That is certainly a code bug that I can fix with a hack, so I need to look closer at the retries.  The EVL will immediately kick the connection if you try to write to the device while not logged in.

One thing that I did find with testing is that if I increase the value of the IO timeout constants from 5 to 15, the plugin loads more reliably on a luup reload.  So maybe a partial fix would be to either allow a timeout value to be passed to the IO.open as an extra parameter, or to expose the timeouts so that they can be set dynamically in the luup variables.  The first approach would allow more customization per plugin.....
Title: Re: lu_block_jobs
Post by: akbooer on January 23, 2019, 05:59:22 pm
Quote
So it is impossible to tell what kind of behavior was occurring, because the entire machine was unstable.

Ah.

Quote
To the problem at hand--one thing I'm not clear on is whether the same socket ID (in this case 0xaaab0ddb9858) is assigned to a socket within a given instance of openLuup regardless of whether it has been previously closed.  I'm guessing that the ID is a handle to the socket, and not a hard identifier of an IP address + port combination, as the ID changes with a luup reload.  So part of the problem may be that a new socket handle needs to be created in the event of a socket close (closed internally by openLuup) that is then followed by an open attempt.

Yes, this is exactly right.  The whole implementation was based on my understanding of Vera's behaviour: that a single socket is assigned to the device on startup (if an IP is given) or on io.open(), and is not subsequently changed.  This is what I meant when I said that there is no luup.io.close().  Having come to a point where the socket has timed out or been closed from the remote end, there IS no routine in the defined API to close the socket and allow another call to luup.io.open(), and without doing that explicitly, there is no way to handle any initial login/handshake that a new TCP connection may require.
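
Just to make the gap concrete, the sequence a plugin would need after a remote close might look like the sketch below.  Note that luup.io.close() here is NOT a real call, on either Vera or openLuup; that is precisely the missing piece (LOGIN_MESSAGE is also just illustrative):
Code: [Select]
luup.io.close (lul_device)                  -- hypothetical: no such function exists
luup.io.open (lul_device, ip, port)         -- ...which is why this re-open can't work today
luup.io.intercept (lul_device)              -- then redo the device's login/handshake
luup.io.write (LOGIN_MESSAGE, lul_device)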

Having said that, since the Vera implementation is so poor, and therefore the openLuup copy equally so, I'm not beyond changing the way it works for the better.  I just never got my head around HOW that would exactly work.  This may indeed be the very test case that I was looking for.

Quote
So maybe a partial fix would be to either allow a timeout value to be passed to the IO.open as an extra parameter, or to expose the timeouts so that they can be set dynamically in the luup variables.  The first approach would allow more customization per plugin.....

Yes, possibly.  I'm mostly against adding extra parameters to the Luup API (in case they add ones in the future) but, on the other hand, I have done that in extremis a couple of times.

I'm willing to take a look at this if you're able to test out some of my attempts.
Title: Re: lu_block_jobs
Post by: Buxton on January 23, 2019, 06:06:53 pm
Yes, I have a secondary linux machine that I use for testing so there's no problem with running untested openLuup versions.
Title: Re: lu_block_jobs
Post by: reneboer on January 24, 2019, 05:19:33 am
Hi,

I am currently thinking of testing the waters with luup.io and the incoming routine for a websocket connection, rather than how I've done it now with a socket and a one-second call_delay checking for waiting data using socket.select.  The remote side will close the connection after one minute of idle time, so I can test what works.  Last time I tried, it seemed that the raw protocol is not 100% raw, causing some problems on Vera at least.  Also, using incoming for all incoming data proved to be sooo slow it was useless.  So I want to try to use luup.io.read after receiving the first byte (or two).  Whether a new luup.io.open is needed, I would make dependent on the result of luup.io.is_connected.
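
For reference, the polling approach I use now is roughly this sketch (sock and handleData are illustrative names; sock is assumed to be opened elsewhere):
Code: [Select]
local socket = require "socket"

function pollSocket ()
    local readable = socket.select ({sock}, nil, 0)     -- zero timeout: just peek
    if readable[sock] then
        local data, err, partial = sock: receive (1024)
        handleData (data or partial)                    -- illustrative handler
    end
    luup.call_delay ("pollSocket", 1)                   -- check again in one second
end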

Once I have results for both Vera and openLuup I'll let you know.

Cheers Rene
Title: Re: lu_block_jobs
Post by: akbooer on January 24, 2019, 06:47:54 am
Excellent!  Many thanks for the offer.
Title: Re: lu_block_jobs
Post by: rigpapa on January 24, 2019, 01:55:08 pm
I'm following this conversation as well and working in the background on making WebSocket work with the Emby plugin, and a non-WebSocket continuous feed for Yeelight. I had both working with direct socket calls, and polling the socket, same as reneboer. Now I'm converting Emby to <incoming> and luup.io, to see if I can get something more like async behavior.

One thing I've noticed right away that does not work: if you have a child device, and handleChildren=1, and the children have their own sockets for talking to different endpoints (so you call luup.io.open with the child device number), you can luup.io.write() and luup.io.read() no problem, but I have not been successful getting incoming data routed through the parent implementation's <incoming> block to the child--I can see the data coming in (log level 52), but the incoming block is never called. If I make all the luup.io calls using the root device number, the incoming block starts working, but that's not a solution, because it seems only one socket can be bound to a device in the luup.io model, so the parent can't maintain sockets for more than one child.  I would call this a (Vera) bug (perhaps an incomplete implementation), because everything else on children seems to work, including the use of the <protocol> and <ioPort> tags in the device files for the child types.

I've got to run off to do some errands this afternoon, but when I return I'm going to see if <incoming> blocks on jobs for child devices work. That could be a workaround.
Title: Re: lu_block_jobs
Post by: akbooer on January 24, 2019, 03:37:58 pm
Not surprised it doesn't work for the child devices... no one ever told me that it should!  Are you sure that's the way it works on Vera?  Where might that be documented?

I lost the will to live trying to reverse engineer the Vera I/O, so I really stopped at the basics, but even then got frustrated by the lack of accurate documentation.  I ended up doing a fair amount of experimentation.

With your help, though, I'm happy to try and improve!
Title: Re: lu_block_jobs
Post by: rigpapa on January 24, 2019, 04:37:02 pm
Quote
Not surprised it doesn't work for the child devices... no one ever told me that it should!  Are you sure that's the way it works on Vera?  Where might that be documented?

No, that's not openLuup. Sorry if I wasn't clear about that. It's Vera that's broken. Vera can't route for child devices through the top-level <incoming>. That's a real pain in the backside. Everything else for the child works--they can have different protocols and ports, etc. All fine. It's just the <incoming> routing that fails. So, for a solo device like an alarm panel, where there's only one endpoint to talk to, that's no problem, the parent can do all of the talking and everything works. But if you need a dozen sockets because you have a dozen children connected to a dozen endpoints... out of luck. The only thing that might save it is if the device reply includes some identifying information that would let you figure out which endpoint the message came from and dispatch it to the correct device, but for Yeelights, for example, no such luck. If Vera had even included the IP address of the peer in the incoming callback... hmmm... I wonder if that's hiding in there somewhere... time to go dump _G in the callback and see what it's got...

Quote
I lost the will to live trying to reverse engineer the Vera I/O, so I really stopped at the basics, but even then got frustrated by the lack of accurate documentation.  I ended up doing a fair amount of experimentation.

 ;D ;D ;D ;D ;D

It's not good. Especially with the above.

My bigger-scope prediction is that the bulk of IoT devices are going to be WiFi, and Z-Wave and its ilk will slowly be relegated to a very specific subset of devices. Time to market and cost will be drivers away from protocols like Z-Wave, not to mention the ubiquity of WiFi. So, for that to be at all sane, Vera needs to get its act together on sockets. LuaSocket under Luup doesn't cut it, and luup.io isn't showing well either. This all needs fresh eyes on it.
Title: Re: lu_block_jobs
Post by: rigpapa on January 24, 2019, 05:57:06 pm
So, good news for me, but probably not for you, @akbooer...

If, instead of trying to use the "global" incoming, I use <incoming> in a job action that runs on the child, the received data is dispatched correctly (on Vera) to the job (and therefore the child).

Code: [Select]
        <action>
            <serviceId>urn:toggledbits-com:serviceId:EmbyServer1</serviceId>
            <name>Update</name>
            <job>
                return 5,30 -- waiting for callback, 30 second timeout
            </job>
            <incoming>
                EmbyPlugin.handleIncoming( lul_data, lul_device )
                return 5,30 -- waiting for callback, 30-second timeout
            </incoming>
        </action>
Title: Re: lu_block_jobs
Post by: akbooer on January 25, 2019, 05:26:59 am
This is for a luup.io.open() issued from the parent with the child device number?
Title: Re: lu_block_jobs
Post by: rigpapa on January 25, 2019, 07:22:59 am
Quote
This is for a luup.io.open() issued from the parent with the child device number?

Yes, and the same for an IP assigned as an attribute to the child device.
Title: Re: lu_block_jobs
Post by: reneboer on January 25, 2019, 09:15:23 am
Hi,

Got some testing done, and basically I get the following on the intercept function.  It seems to work as documented on Vera.  That is:
 1) call intercept
 2) call write
 3) call read

What does not make a difference is calling intercept when just receiving.  In my example I read the WebSocket protocol, which is a few header bytes (including the payload length) and then that payload.  So I tried to have the incoming handler process the header, and then use luup.io.read to read the payload.  That basically does not work, and adding intercept makes no difference.  I have to use the incoming handler to process the payload bytes.
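
(For anyone following along, the header logic in question is roughly this sketch, per RFC 6455; the function name is illustrative.)
Code: [Select]
-- the second byte of a WebSocket frame: mask bit + 7-bit payload length
local function payloadLength (byte2)
    local len = byte2 % 128                 -- strip the mask bit
    if len == 126 then return nil, 2        -- 16-bit extended length follows
    elseif len == 127 then return nil, 8    -- 64-bit extended length follows
    else return len, 0 end                  -- length fits in the header itself
end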

luup.io.is_connected is no good indication of an open connection.  After the peer closes the socket from their end, is_connected still returns true.  In my case the peer starts replying with an HTTP 500 message, so I could use that to restore the connection.

Next test is openLuup.

Cheers Rene

Title: Re: lu_block_jobs
Post by: rigpapa on January 25, 2019, 10:30:33 am
Quote
Hi,

Got some testing done, and basically I get the following on the intercept function.  It seems to work as documented on Vera.  That is:
 1) call intercept
 2) call write
 3) call read

What does not make a difference is calling intercept when just receiving.  In my example I read the WebSocket protocol, which is a few header bytes (including the payload length) and then that payload.  So I tried to have the incoming handler process the header, and then use luup.io.read to read the payload.  That basically does not work, and adding intercept makes no difference.  I have to use the incoming handler to process the payload bytes.

luup.io.is_connected is no good indication of an open connection.  After the peer closes the socket from their end, is_connected still returns true.  In my case the peer starts replying with an HTTP 500 message, so I could use that to restore the connection.

Next test is openLuup.

Cheers Rene

Rene, my testing (also doing WebSocket) pretty much concurs.  Trying identical things (using io.read() to get the payload data once I have the length), I came to similar conclusions: switching to luup.io.read() for the payload yielded no benefit performance-wise.  I was able to get it to work using intercept() first, and noted that if I didn't call intercept() first, a deadlock often resulted.  IMPORTANT: this is specifically from within the <incoming> handler.  In other code, skipping the intercept() call does not result in an io.read() hang; you just run the risk of missing data.

That said, I have successfully used luup.io.read(), intercept() and <incoming> together, like this (in "raw" mode) to handle getting the connection upgraded to WebSocket:

1) Call luup.io.intercept();
2) Use luup.io.write() to send the HTTP request with headers to upgrade to Websocket;
3) Use luup.io.read() to gather response data, scanning for end of headers (consecutive cr/lf/crlf);
4) Celebrate successful conversion to WebSocket;
5) Send a WebSocket request (no-op is fine); this is necessary to reset the effect of the previous intercept();
6) Use <incoming> from that point forward.

I imagine this is similar to what you are doing.
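
In rough outline, steps 1-5 look something like this sketch, assuming the "raw" protocol's one-byte reads and a 2-second read timeout (the host and key are illustrative; this is not the actual plugin code):
Code: [Select]
local function upgradeToWebSocket (dev, host)
    luup.io.intercept (dev)                             -- (1) claim the incoming data
    luup.io.write (table.concat ({                      -- (2) HTTP upgrade request
        "GET / HTTP/1.1",
        "Host: " .. host,
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==",  -- example key only
        "Sec-WebSocket-Version: 13",
        "", ""}, "\r\n"), dev)
    local reply, b = {}
    repeat                                              -- (3) scan for end of headers
        b = luup.io.read (2, dev)
        reply[#reply+1] = b
        local n = #reply
    until not b or table.concat (reply, '', math.max (1, n-3), n) == "\r\n\r\n"
    -- (4) check the status line; then (5) send a no-op frame to reset the
    -- intercept, and (6) let <incoming> handle everything from here on
    return table.concat (reply): match "^HTTP/1%.1 101" ~= nil
end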

But have you noticed the performance? Really, really poor. Appalling, actually. I removed all processing and just counted bytes and returned immediately from my incoming handler, trying to see how fast it would pass a 32KB JSON from the WebSocket endpoint, and the best I could do (on my Vera3) was just shy of 1800 bytes per second (18 seconds for the 32KB response--awful, just terrible). Receiving one byte at a time is just way too much overhead. I've already sent a note to Sorin requesting a "rawblock" protocol to return the entirety of a larger socket.receive() (e.g. 1024 bytes, connection MTU, or something even larger). But of course, that will be far off, if they agree to do it at all. For my application, as it is today it's a non-starter, as it takes 15-20 seconds to receive the average payload, but updates come every 5 seconds or less on an active endpoint--it can't keep up, not even close.

@akbooer, it would be great for openLuup if we could set a switch somewhere (openluup.longrawread = true?) to have the raw protocol return sock:recv(1024) or larger as described. At least then, performance on openLuup would shine over Vera (again).

Note: I also can't stress enough that, as a best practice, string concatenation should not be used to reassemble the packet in an incoming handler. String concats in Lua are slowwww, and get worse as the string grows. It's much faster to insert the data into a table, and then concat the table to a string when you have the entire payload.
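
For example, something like this minimal sketch (payloadComplete and processPayload are hypothetical stand-ins):
Code: [Select]
local chunks = {}

function handleIncoming (lul_data)
    chunks[#chunks+1] = lul_data                -- O(1) append; no string copying
    if payloadComplete (chunks) then            -- hypothetical end-of-payload test
        local payload = table.concat (chunks)   -- one concatenation, at the end
        chunks = {}
        processPayload (payload)                -- hypothetical payload handler
    end
end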
Title: Re: lu_block_jobs
Post by: akbooer on January 25, 2019, 11:08:13 am
But have you noticed the performance? Really, really poor... Receiving one byte at a time is just way too much overhead.

This, of course, is why the <crlf> protocol is good, receiving a whole line at a time.  Unfortunately, you don't have that option for a binary stream.

Quote
@akbooer, it would be great for openLuup if we could set a switch somewhere (openluup.longrawread = true?) to have the raw protocol return sock:recv(1024) or larger as described. At least then, performance on openLuup would shine over Vera (again).

So what would terminate the read in the event of a shorter stream?

Quote
Note: I also can't stress enough that, as a best practice, string concatenation should not be used to reassemble the packet in an incoming handler. String concats in Lua are slowwww, and get worse as the string grows. It's much faster to insert the data into a table, and then concat the table to a string when you have the entire payload.

Amen to that!

My preference is to repackage some of the asynchronous I/O which is implemented in openLuup for servers, which quite happily handle incoming callbacks to jobs.  At the moment, this only works for TCP connections initiated by an incoming server socket, but there's no reason they couldn't apply to a specifically opened socket.  Currently, also, the callbacks are to a regular function, rather than code in an <incoming> block since this is much easier to set up.  The called routine receives a proxy socket as an argument to do with what it will.  Special handling ensures that the socket is correctly closed in the event of an error or a timeout and removed from the scheduler's socket watch list.

I think this would be far less frustrating than trying to invent a way that Vera/MiOS I/O should or might work in the future.  But you would, of course, be stuck with something that is openLuup-specific.

Title: Re: lu_block_jobs
Post by: rigpapa on January 25, 2019, 12:03:24 pm
Quote
@akbooer, it would be great for openLuup if we could set a switch somewhere (openluup.longrawread = true?) to have the raw protocol return sock:recv(1024) or larger as described. At least then, performance on openLuup would shine over Vera (again).

Quote
So what would terminate the read in the event of a shorter stream?


A non-blocking, no-timeout socket:receive( 1024 ) will return up to 1024 bytes, shorter if less data is pending on the socket:

Code: [Select]
sock:settimeout(0)
local data, err, partial = sock:receive( 1024 )
if data == nil then
   -- if partial not empty ("") dispatch to handler
   -- if err ~= "timeout" then handle error; for timeout, go do something else and come back later?
else
   -- dispatch data to incoming handler
end

EDIT: Fixed code. Sorry, rushing around today... third return parameter to receive is partial data if the timeout occurs, which it will almost immediately. Dispatch that to the incoming handler.
Title: Re: lu_block_jobs
Post by: akbooer on January 25, 2019, 12:19:42 pm
...so the easiest thing is simply to define a new protocol option raw_1024?

__________

Edit: ah... I think that is what you were suggesting earlier.
Title: Re: lu_block_jobs
Post by: rigpapa on January 25, 2019, 12:23:01 pm
Quote
...so the easiest thing is simply to define a new protocol option raw_1024?

Two reasons I would not do it this way:

1) It requires a change in the device file to use the different protocol, and thus you'd need different device files for Vera vs openLuup. Isn't that a first?

2) Vera owns that "namespace", and if it were me, I'd want to avoid potential conflicts in the (admittedly unlikely) event they someday create a "raw_1024" with different semantics. More principle than likely conflict, I'll admit.

Edit for clarity: I am recommending VERA add a new protocol. Until that's done, and you can know what it is and its particular implementation semantics (if they ever do it), I am recommending to you just a flag in openLuup to change the behavior of "raw" so that it can do either: the Vera way of returning 1 byte, or the openLuup way of returning a larger block. Then when Vera does whatever it may do, yours still works, and you can implement theirs when you get around to it.
Title: Re: lu_block_jobs
Post by: reneboer on January 25, 2019, 12:40:30 pm
Hi,

@Patrick, yes my logic is just like yours.

It was my experience that it is slow indeed.  On a Vera Lite it takes about 20 seconds to get 75K of data.  It is one I do not need often, but still.  With native sockets it is 3-4 seconds.  Most replies from the Harmony are well below 1KB and take 1 sec max.  I guess I can get it to work, but I then have to make some sending queue as well, so a reply is handled before the next message gets sent.  The Harmony also sends status messages whenever it feels like it (without a write first) that I need to process.
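
The queue itself would be simple enough; something like this sketch (the names are illustrative, and lul_device is assumed to be in scope):
Code: [Select]
local sendQueue, awaitingReply = {}, false

local function send (msg)
    if awaitingReply then
        sendQueue[#sendQueue+1] = msg       -- defer until the current reply is handled
    else
        awaitingReply = true
        luup.io.write (msg, lul_device)
    end
end

local function replyHandled ()              -- call this from the incoming handler
    awaitingReply = false
    local nextMsg = table.remove (sendQueue, 1)
    if nextMsg then send (nextMsg) end
end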

The connection is now kept alive by sending a WS ping, so as long as that works it is OK.  However, when it stops and I try to connect again, internal errors from LuaInterface.cpp start flying around.  The same happens when trying a new WS handshake by intercept, write, read.  Both are a sure way to trigger luup reloads.

For now I'm going to stick to working with sockets directly. This is too shaky for my taste.

For improved handling I would first ask for a functioning is_connected, and a close and (re)connect possibility without taking luup down (that's an ask of Vera, not you, AK).  Also, being able to read a specified number of bytes on the raw protocol would help, i.e. an extra parameter.  Many needing raw will likely have a similar message format to websockets, so you know what to look for.

Cheers Rene
Title: Re: lu_block_jobs
Post by: reneboer on January 25, 2019, 01:54:39 pm
BTW,

Just tested handling that large request with intercept and using luup.io.read.  Five minutes!!!! to get that same 75KB.  And yes, the exact same logic as used in the incoming handler, just wrapped in a repeat, with a luup.io.read to get each of those characters one by one.  So that's 10 times slower than incoming, and utterly useless.

Cheers Rene
Title: Re: lu_block_jobs
Post by: HSD99 on January 25, 2019, 03:11:41 pm
Quote
My bigger-scope prediction is that the bulk of IoT devices are going to be WiFi, and Z-Wave and its ilk will slowly be relegated to a very specific subset of devices. Time to market and cost will be drivers away from protocols like Z-Wave, not to mention the ubiquity of WiFi. So, for that to be at all sane, Vera needs to get its act together on sockets. LuaSocket under Luup doesn't cut it, and luup.io isn't showing well either. This all needs fresh eyes on it.

I agree 100% with the above.  I did a high-end residential wireless control system in the late 1990s (pre-Z-Wave, Zigbee, etc.) in the 900 MHz range, with wallbox dimmers, switches and plug-in modules.  A tiny, cheap ~900 MHz radio was all you could do in 1998---the idea of putting an 802.11b interface with a software stack into a small device was simply not in the cards.

With HaLow (https://en.wikipedia.org/wiki/IEEE_802.11ah) coming along and cheap, low-power 802.11g/n interfaces integrated into small MCUs, Z-Wave, Zigbee, and Bluetooth et al will fade away from the HA and IoT marketplace.
Title: Re: lu_block_jobs
Post by: Buxton on January 26, 2019, 06:29:13 pm
On a similar note to the discussion of the <incoming> block, would it be possible to include a timeout tag of some sort in the device file?  Perhaps <device><olTimeout>15</olTimeout></device>.  That way you wouldn't have to modify io.open to accept additional parameters.  This would apply to <incoming> data only, and not any specific calls to io.open for files and whatnot.  And if you allowed the overall luup.io timeout to be set in an exposed variable, that would take care of the general need, based on one's local network/machine response, etc.

I don't know enough about the UPnP spec to know if this would be disruptive to a plugin running on a Vera--in the sense of Vera crashing because of an unrecognized UPnP variable.
Title: Re: lu_block_jobs
Post by: rigpapa on January 26, 2019, 07:01:05 pm
Quote
On a similar note to the discussion of the <incoming> block, would it be possible to include a timeout tag of some sort in the device file?  Perhaps <device><olTimeout>15</olTimeout></device>.  That way you wouldn't have to modify io.open to accept additional parameters.  This would apply to <incoming> data only, and not any specific calls to io.open for files and whatnot.  And if you allowed the overall luup.io timeout to be set in an exposed variable, that would take care of the general need, based on one's local network/machine response, etc.

I don't know enough about the UPnP spec to know if this would be disruptive to a plugin running on a Vera--in the sense of Vera crashing because of an unrecognized UPnP variable.

If you do your read/<incoming> in a <job>, the 5,X return from the job lets you specify the timeout (the X return parameter). This seems to work as advertised on Vera, IIRC, although I've had no luck doing something responsive when the timeout occurs. There is a <timeout> tag, which supposedly may contain Lua to execute when the timeout occurs, but I don't think that worked as planned when I last tried, and I could have just brain-faded it. I need to go back and play with it; that issue wasn't my focus at the time. The docs are just so vague and fragmented in this area. Ref: http://wiki.micasaverde.com/index.php/Luup_Declarations#.3Ctimeout.3E
Title: Re: lu_block_jobs
Post by: Buxton on January 26, 2019, 07:13:14 pm
OK, but assuming <incoming> is first handled/started in the Vera with something similar to io.open (using the socket library), is there a way to set the timeout for opening the socket in the device file?  This seems to be where the plugin I'm using stalls.  Once the socket is open, data flows as predicted, so I don't need to mess with the read timeout, as that is handled by the plugin itself in the way you outlined.
Title: Re: lu_block_jobs
Post by: rigpapa on January 26, 2019, 07:48:59 pm
Quote
OK, but assuming <incoming> is first handled/started in the Vera with something similar to io.open (using the socket library), is there a way to set the timeout for opening the socket in the device file?  This seems to be where the plugin I'm using stalls.  Once the socket is open, data flows as predicted, so I don't need to mess with the read timeout, as that is handled by the plugin itself in the way you outlined.

Yeah, no idea there. AFAIK, Vera just keeps on trying to open the socket until it succeeds. Even luup.io.open() doesn't have a timeout parameter, and it just returns and the socket is opened asynchronously (ouch). The docs basically say you need to write to it to see if the socket is open (fails if not). "...the actual opening of the socket occurs asynchronously and this function returns nothing. You will know that the socket opening failed if your subsequent call to write fails." SMH  :o
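
So the only robust pattern I know of is the write-and-probe test, something like this sketch (the probe message is illustrative, and "retryConnect" is a hypothetical global function):
Code: [Select]
luup.io.open (lul_device, ip, port)                     -- opens asynchronously; returns nothing useful
local ok = luup.io.write ("^01,00$\r\n", lul_device)    -- probe: any real device message will do
if not ok then
    -- the socket never opened (or has since dropped); try again later
    luup.call_delay ("retryConnect", 15)
end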