Why are you learning Python if you haven't needed it before?

Many of the tools I use have Python embedded right in there and I need to do things that the tools don't do natively.

I can read Python and understand it, but not really write it from scratch without stumbling through and reading lots of docs, StackOverflow, and GeeksForGeeks.

I need to manipulate some data using Alteryx that comes from an API call. For reasons that are dumb, but insurmountable, Alteryx cannot call this API directly.

Alteryx has Python embedded and even includes an embedded Jupyter IDE. I have Python code that makes the API calls and pulls down the data.

The only way to get the data out of the Python blob is using a pandas datastream. Any field that was a list or dictionary in the original data, is just a string when it comes out in the same pickle format you'd get if you did a print from Python.

Alteryx doesn't understand pickle format. It barely understand JSON. But I need to convert pickle to JSON and I cannot find a reliable way to do that just using character substitution and regular expressions.

I can't get the data out of Python in JSON format unless you know how to force pandas to represent the data as JSON instead of native Python dict and list. Alteryx properly converts the other datatypes to Alteryx-native datatypes, just not those.

There are tools in Alteryx to unnest objects and flatten arrays but they have to be valid JSON.

When I try just replacing ' with ", None with null, True with true, and False with false, I get errors which I assume are from replacing an embedded or escaped '.

Alteryx error messages are terrible. The input is 18,000 lines, each one a dict to be parsed. I convert them to what I think is valid JSON and send them to the JSON parser which returns this error Missing comma or '}' after an object member. at character position: 170. Not which line it happened on. Not any context where I can search or find it. Just that.

Would anyone have a reliable way to convert the strings of pickle to valid JSON?

#Python #Alteryx #JSON #PCRE #regex #Pickle

It was this.

"[concat(parameters('vmname'), '/', parameters('extensionname'))]"

Still, if anyone knows a cool and good way to detect single quotes embedded in a set of doublequotes and then escaping them using just pcre, that would be cool.

@jrdepriest You'll have to check this in Python, but in JavaScript strings and PCREs seem to allow arbitrary characters to be escaped using \x followed by the hexadecimal for the character.

$ node
Welcome to Node.js v18.19.1.
Type ".help" for more information.
> "\x27"
"'"
> /\x27/.test("a'b")
true
> /\x27/.test("ab")
false
>

@coderCyclist

I was able to get it working in Alteryx by using two regexes.

First REGEX_Replace([data], "'([^']*)': '([^']*)'", '"$1": "$2"') and then REGEX_Replace([data_parsed], "'([^']*)': "+'"([^"]*)"', '"$1": "$2"')

Edit: I didn't know I could use the + for string concatenation inside a REGEX_Replace command until today. That will come in handy.

@jrdepriest That's not in Pickle (Python-specific serialization) format. From a quick web search, looks like an Azure Resource Management template string. Maybe Azure has some generic helpers to deal with those?

https://learn.microsoft.com/en-us/azure/azure-resource-manager/templates/template-expressions

Template syntax and expressions - Azure Resource Manager

Describes the declarative JSON syntax for Azure Resource Manager templates (ARM templates).

@tero You are correct. The overall data is pickle but the API is returning that Azure string in lieu of an actual hostname for a single host out of over 18,000 with hundreds of other Azure VPCs. I don't know why it only screws up that one host.