ScreenAnarchy: SHADOW Interview: Actress Sun Li Sheds Light on Zhang Yimou's Latest

In Shadow, actress Sun Li portrays a noblewoman trapped in a deadly game of deception, as she must pass a body double off as her husband, a great and powerful commander at odds with their king.   In an exclusive chat with LMD, Sun spoke about her role in the film, working beside her husband, actor/director Deng Chao, and joining the pantheon of director Zhang Yimou’s indomitable, unforgettable female characters.    The Lady Miz Diva:  First, please tell us how the role of Madam, the Commander’s wife, came to you?    Sun Li:  My husband, Deng Chao, was set to perform in this movie, and I was so excited about that.  Then a little bit of time passed, and Deng Chao told me that director...

[Read the whole post on screenanarchy.com...]

Slashdot: Jeff Bezos Unveils Lunar Lander To Take Astronauts To the Moon By 2024

An anonymous reader quotes a report from CNBC: Jeff Bezos, chairman of Amazon and founder of Blue Origin, unveiled his space company's lunar lander for the first time on Thursday. "This vehicle is going to the moon," Bezos said during an invite-only presentation to media and space industry executives. "We were given a gift -- this nearby body called the moon," Bezos said. He added that the moon is a good place to begin manufacturing in space due to its lower gravity compared to Earth's. Getting resources from the moon "takes 24 times less energy to get it off the surface compared to the Earth," Bezos said, and "that is a huge lever." Bezos also unveiled the company's BE-7 rocket engine at the event. The engine will be test fired for the first time this summer, Bezos said. "It's time to go back to the moon and this time stay," Bezos said. "I love Vice President Pence's 2024 lunar landing goal," Bezos said, adding that Blue Origin can meet that timeline "because we started this three years ago." Blue Origin's most visible program has been its New Shepard rocket system, which the company is developing to send tourists to the edge of space for 10 minutes. New Shepard has flown on 11 test flights, with its capsule, built to carry six passengers, reaching an altitude of more than 350,000 feet. The capsule features massive windows, providing expansive views of the Earth once in space. The company plans to send its first humans onboard a New Shepard rocket sometime in the next year. But it has yet to begin selling tickets.

Read more of this story at Slashdot.

Disquiet: Disquiet Junto Project 0384: Breath Beat

Each Thursday in the Disquiet Junto group, a new compositional challenge is set before the group’s members, who then have just over four days to upload a track in response to the assignment. Membership in the Junto is open: just join and participate. (A SoundCloud account is helpful but not required.) There’s no pressure to do every project. It’s weekly so that you know it’s there, every Thursday through Monday, when you have the time.

Deadline: This project’s deadline is Monday, May 13, 2019, at 11:59pm (that is, just before midnight) wherever you are. It was posted in the afternoon, California time, on Thursday, May 9, 2019.

These are the instructions that went out to the group’s email list (at tinyletter.com/disquiet-junto):

Disquiet Junto Project 0384: Breath Beat
The Assignment: Explore breath as a resource for rhythm.

Just one step this week:

Step 1: Record yourself breathing, and make that the pulse of a piece of music.

Seven More Important Steps When Your Track Is Done:

Step 1: Include “disquiet0384” (no spaces or quotation marks) in the name of your track.

Step 2: If your audio-hosting platform allows for tags, be sure to also include the project tag “disquiet0384” (no spaces or quotation marks). If you’re posting on SoundCloud in particular, this is essential to the subsequent location of tracks for the creation of a project playlist.

Step 3: Upload your track. It is helpful but not essential that you use SoundCloud to host your track.

Step 4: Post your track in the following discussion thread at llllllll.co:

https://llllllll.co/t/disquiet-junto-project-0384-breath-beat/

Step 5: Annotate your track with a brief explanation of your approach and process.

Step 6: If posting on social media, please consider using the hashtag #disquietjunto so fellow participants are more likely to locate your communication.

Step 7: Then listen to and comment on tracks uploaded by your fellow Disquiet Junto participants.

Additional Details:

Deadline: This project’s deadline is Monday, May 13, 2019, at 11:59pm (that is, just before midnight) wherever you are. It was posted in the afternoon, California time, on Thursday, May 9, 2019.

Length: The length is up to you.

Title/Tag: When posting your track, please include “disquiet0384” in the title of the track, and where applicable (on SoundCloud, for example) as a tag.

Upload: When participating in this project, post one finished track with the project tag, and be sure to include a description of your process in planning, composing, and recording it. This description is an essential element of the communicative process inherent in the Disquiet Junto. Photos, video, and lists of equipment are always appreciated.

Download: Consider setting your track as downloadable and allowing for attributed remixing (i.e., a Creative Commons license permitting non-commercial sharing with attribution, allowing for derivatives).

For context, when posting the track online, please be sure to include the following information:

More on this 384th weekly Disquiet Junto project — Breath Beat / The Assignment: Explore breath as a resource for rhythm — at:

https://disquiet.com/0384/

More on the Disquiet Junto at:

https://disquiet.com/junto/

Subscribe to project announcements here:

http://tinyletter.com/disquiet-junto/

Project discussion takes place on llllllll.co:

https://llllllll.co/t/disquiet-junto-project-0384-breath-beat/

There’s also a Junto Slack. Send your email address to twitter.com/disquiet for Slack inclusion.

Image associated with this project adapted (cropped, colors changed, text added, cut’n’paste) thanks to a Creative Commons license from a photo credited to Victor Morell Perez:

https://flic.kr/p/4M5zUQ

https://creativecommons.org/licenses/by-nc/2.0/

Slashdot: Microsoft Recommends Using a Separate Device For Administrative Tasks

In a rare article detailing insights about its staff's efforts in securing its own internal infrastructure, Microsoft has shared some very insightful advice on how companies could reduce the risk of having a security breach. From a report: The central piece of this article is Microsoft's recommendation in regards to how companies should deal with administrator accounts. Per Microsoft's Security Team, employees with administrative access should be using a separate device, dedicated only for administrative operations. This device should always be kept up to date with all the most recent software and operating system patches, Microsoft said. "Provide zero rights by default to administration accounts," the Microsoft Security Team also recommended. "Require that they request just-in-time (JIT) privileges that gives them access for a finite amount of time and logs it in a system." Furthermore, the OS vendor also recommends that administrator accounts should be created on a separate user namespace/forest that cannot access the internet, and should be different from the employee's normal work identity.

Read more of this story at Slashdot.

Recent additions: servant-purescript 0.9.0.3

Added by eskimo, Thu May 9 21:59:21 UTC 2019.

Generate PureScript accessor functions for your servant API

MetaFilter: My Cousin Was My Hero. Until the Day He Tried to Kill Me.

A man grapples with his own indoctrination in toxic masculinity. After a terrible attack by a family member, NYT writer Wil S. Hylton takes stock of what being a man has meant to him and his family, and what it could mean to his children. Trigger warning: violence.

MetaFilter: "They looked so great when they played." - Charlie Watts

Sharp suits, thin ties and the coolest musicians on Earth is an appreciation by the Guardian's Richard Williams of BBC Two's Jazz 625 series of concerts, which were all recorded in 1964 and '65, featuring the giants of the jazz scene, from Dizzy Gillespie to the Modern Jazz Quartet. This is a good sampler of music from the show, but a few whole episodes are available online, and I've put links to the ones I found below the cut.

Bill Evans Trio
Art Blakey and the Jazz Messengers
Dizzy Gillespie Quintet
Oscar Peterson Quintet
The Modern Jazz Quartet
Dave Brubeck Quartet
Thelonious Monk & part 2
Coleman Hawkins
Jimmy Smith
Erroll Garner & part 2
Clark Terry
Ben Webster
Willie "The Lion" Smith
Alex Walsh

MetaFilter: "I would not have expected to agree with a radical lesbian feminist."

At Jezebel, Esther Wang reports on the close ties forged between conservatives and trans-exclusionary radical feminists (TERFs) along with the transphobic legislation they have been able to push through together.

MattCha's Blog: Fine Tuning The Speed Test

One of my favorite blog posts is this article by MarshalN of A Tea Addict's Journal, where he introduces The Speed Test as “a more honest and straightforward method of determining whether you like a tea or not.”  Its method is simple: the teas you drink through the fastest are likely the teas that you enjoy the most.

I find there is a lot of truth to this measure.  I really like it, to the point that I’m considering publishing how much of certain cakes I drink through in a year as a measure of their value.  Actually, I have even come up with my own straightforward, practical metric that follows from and increases the reliability of The Speed Test: The Re-Order Test (see comment here).  The Re-Order Test is also a practical way of assessing the subjective value of a tea.

Anyhow, there are a few considerations about The Speed Test that you might want to take into account for it to be a more valuable measure…

First, the tea that you are speed testing has to be readily available for daily consumption.  It can’t be locked away in some complex storage vault.  It has to be easy to get at.

Second, it has to be something that can be easily re-ordered.  If you can’t ever re-order it again and you really like it and want to age it, you might be more reluctant to drink through it as quickly.  As MarshalN puts it in that article, “I have to control myself from drinking, lest I run out of it.”

Thirdly, and most importantly, you have to consider the stamina and potency of the tea you are drinking. 

What do I mean by stamina? - How many enjoyable infusions can you get out of the dry leaf?

What do I mean by potency? - How many grams of dry leaf do you need to get the maximum amount of enjoyable infusions? 

The above considerations will drastically impact how quickly you drink through your tea.  It might even skew the results of the Speed Test.  I know it influences how fast I drink through my puerh.

Peace
 

Bifurcated Rivets: From FB

Gorgeous

Bifurcated Rivets: From FB

This is wonderful

Bifurcated Rivets: From FB

Good stuff

Bifurcated Rivets: From FB

I want one of these.

Bifurcated Rivets: From FB

Shuudan Koudou

Slashdot: Hackers Breached 3 US Antivirus Companies, Researchers Reveal

In a report published Thursday, researchers at the threat-research company Advanced Intelligence (AdvIntel) revealed that a collective of Russian and English-speaking hackers is actively marketing the spoils of data breaches at three US-based antivirus software vendors. From a report: The collective, calling itself "Fxmsp," is selling both source code and network access to the companies for $300,000 and is providing samples that show strong evidence of the validity of its claims. Yelisey Boguslavskiy, director of research at AdvIntel, told Ars that his company notified "the potential victim entities" of the breach through partner organizations; it also provided the details to US law enforcement. In March, Fxmsp offered the data "through a private conversation," Boguslavskiy said. "However, they claimed that their proxy sellers will announce the sale on forums."

Read more of this story at Slashdot.

Recent additions: aeson-gadt-th 0.2.1.0

Added by abrar, Thu May 9 21:22:58 UTC 2019.

Derivation of Aeson instances for GADTs

ScreenAnarchy: I AM MOTHER Trailer: Hilary Swank Stars in Netflix Sci-Fi Thriller

Netflix have released the trailer for Grant Sputore's sci-fi thriller I Am Mother, which will stream on the service starting June 7, 2019. The claustrophobic film stars Clara Rugaard as a teenaged girl who is the first of a new generation of humans to be raised by "Mother" (voiced by Rose Byrne), a robot designed to repopulate the earth after the extinction of humankind. But the pair's unique relationship is threatened when an injured stranger (played by Hilary Swank) arrives with news that calls into question everything Daughter has been told about the outside world. I Am Mother is written by Michael Lloyd Green. It premiered at the 2019 Sundance Film Festival where it received generally positive buzz and was scooped up by Netflix. Check out...

[Read the whole post on screenanarchy.com...]

Slashdot: Millions of People Uploaded Photos To the Ever App. Then the Company Used Them To Develop Facial Recognition Tools.

An anonymous reader shares a report: "Make memories": That's the slogan on the website for the photo storage app Ever, accompanied by a cursive logo and an example album titled "Weekend with Grandpa." Everything about Ever's branding is warm and fuzzy, about sharing your "best moments" while freeing up space on your phone. What isn't obvious on Ever's website or app -- except for a brief reference that was added to the privacy policy after NBC News reached out to the company in April -- is that the photos people share are used to train the company's facial recognition system, and that Ever then offers to sell that technology to private companies, law enforcement and the military. In other words, what began in 2013 as another cloud storage app has pivoted toward a far more lucrative business known as Ever AI -- without telling the app's millions of users.

Read more of this story at Slashdot.

MetaFilter: This Product Has Been Tested By Animals

Specifically, it's been tested by grizzly bears. From Great Big Story, Montana's Bear and Wolf Discovery Center shows just what it means for a cooler or trash can to be bear-resistant.

MetaFilter: The Irregular Outfields of Baseball

"Baseball is a sport rooted in rules and regulations. Everything in the game is standardized, planned, and coordinated, based on a guideline or precedent. Everything, that is, but the park itself: outfield sizes and wall heights vary across the entire league. There are 30 MLB stadiums. No two of them are alike."

If you like this, here's every stadium from the article in one handy graphic. And check out Clem's Baseball for side-by-side comparisons of ballparks both past and present.

OUR VALUED CUSTOMERS: While looking through the dollar bins...

Please buy my book. It's on sale at:

ScreenAnarchy: Umberto Lenzi's THE TOUGH ONES To Make Its Blu Debut From Grindhouse Releasing July 9th

They may not be terribly prolific, but when Grindhouse Releasing announces a new release, it's definitely worthwhile to listen up. Grindhouse, purveyors of quality horror and exploitation cinema since the late '90s, have recently announced that they've completed a 4K restoration of Umberto Lenzi's (Cannibal Ferox, Spasmo, Eyeball) Eurocrime classic The Tough Ones, also known by the incredibly badass title, Rome Armed To The Teeth. This upcoming Blu-ray will not only include the brand new restoration, but also a host of special features including an audio commentary from Eurocrime expert Mike Malloy, interviews with many of the cast and crew, Roberto Curti liner notes, a CD of the soundtrack and much more! This 3-disc set is going to be a scorcher! The first 2,500 copies...

[Read the whole post on screenanarchy.com...]

Slashdot: FCC Blocks China Mobile From Operating in US Over National Security Concerns

The FCC has voted unanimously to deny China Mobile's application to provide telecommunications services in the U.S. due to concerns about national security and law enforcement risks. From a report: China Mobile's application, submitted in 2011, requested permission to provide telecom services in the U.S., including connecting calls between the U.S. and the vast majority of countries, which would involve interconnecting with American internet networks. FCC officials say China Mobile USA is indirectly and ultimately owned and controlled by the Chinese government. It's a subsidiary of global telecom giant China Mobile Limited. U.S. officials saw risks that China Mobile would comply with government espionage requests or that information about U.S. communications networks and users could be exploited. There were also concerns that Chinese government officials could use its access to U.S. networks to block or interfere with communications traffic should an issue arise between the two countries.

Read more of this story at Slashdot.

Recent additions: generics-sop 0.5.0.0

Added by AndresLoeh, Thu May 9 20:01:20 UTC 2019.

Generic Programming using True Sums of Products

Recent additions: sop-core 0.5.0.0

Added by AndresLoeh, Thu May 9 20:00:52 UTC 2019.

True Sums of Products

Recent additions: basic-sop 0.2.0.3

Added by AndresLoeh, Thu May 9 19:37:39 UTC 2019.

Basic examples and functions for generics-sop

new shelton wet/dry: ‘Elle lui rendit son baiser. L’appartement tourna. Il n’y voyait plus.’ –Gustave Flaubert, Bouvard et Pécuchet

Sometime after four in a dark and cold Italian morning, a young woman accompanied a band of men to a duck shoot. After it was over and the frigid hunters sat by the fire, the eighteen-year old Adriana Ivancich, the only woman in the gathering, asked for a comb for her long black hair. [...]

Colossal: A Field Recording by Phil Torres Documents the Waterfall-like Sound of Millions of Migrating Monarch Butterflies

Entomologist and TV host Phil Torres (previously) dives deep into the natural world to document sights and sounds that many of us will never have a chance to experience firsthand. In his most recent video, Torres showcases the sound created by millions of migrating monarchs. The iconic orange and black butterflies convene every year in Mexico, where they overwinter during the Northern Hemisphere’s cooler months. In Torres’ six-minute video, monarchs cluster by the thousands on individual tree branches and swarm the forest air, creating a rushing, waterfall-like sound. We highly recommend listening to the video with a pair of earphones to really pick up the subtleties in the audio. You can see more of Torres’ outdoor explorations on his YouTube channel, The Jungle Diaries, and follow along on Twitter.

new shelton wet/dry: *swipes right*

The convergence in physical appearance hypothesis posits that long-term partners’ faces become more similar with time as a function of the shared environment, diet, and synchronized facial expressions. While this hypothesis has been widely disseminated in psychological literature, it is supported by a single study of 12 married couples. Here, we examine this hypothesis using [...]

new shelton wet/dry: Flashbacks hit me right between the eyes

Physical pain represents a common feature of Bondage and Discipline/Dominance and Submission/Sadism and Masochism (BDSM) activity. This article explores the literature accounting for how painful stimuli may be experienced as pleasurable among practitioners of BDSM, and contrasting this with how it is experienced as painful among non-BDSM individuals. […] The experience of pain in this [...]

new shelton wet/dry: Prettymaide hues may have their cry apple, bacchante, custard, dove, eskimo, fawn, ginger, hemalite isinglass, jet, kipper, lucile, mimosa, nut, oysterette, prune, quasimodo, royal, sago, tango, umber, vanilla, wistaria, xray, yesplease, zaza, philomel, theerose. What are they all by? Shee.

Faces are a primary source of social information, but little is known about the sequence of neural processing of personally relevant faces, such as those of our loved ones. We applied representational similarity analyses to EEG-fMRI measurement of neural responses to faces of personal relevance to participants – their romantic partner and a friend – [...]

ScreenAnarchy: Review: TOLKIEN, Portrait of a Slowly Boring Talent

Nicholas Hoult stars with Lily Collins in a biographical drama about the writer's younger years, directed by Dome Karukoski.

[Read the whole post on screenanarchy.com...]

new shelton wet/dry: Every day, the same, again

Electric vehicles emit more CO2 than diesel ones, German study shows
French man arrives in Caribbean after crossing Atlantic in giant barrel
A “cyber event” interrupted grid operations in parts of the western United States last month, according to a cryptic report posted by the Department of Energy. If remote hackers interfered with grid networks [...]

ScreenAnarchy: Filmmaker Cameo Wood nominated for regional EMMY

Filmmaker Cameo Wood has been nominated for a Northern California Area EMMY Award for her work as producer of the project "Real Artists.” This is the latest of many honors Wood has received for the film, which she also wrote and directed. In 2018 she was awarded an AT&T Film Award for Best Short by an Emerging Filmmaker, and "Real Artists" has played in over 100 film festivals.  "Real Artists" is a science fiction story about a young woman who scores a dream job at an animation studio, only to learn all is not as she expects behind the curtain. It explores the idea of what it means to be a "real artist" in the age of artificial intelligence. The film was created with a majority female...

[Read the whole post on screenanarchy.com...]

Open Culture: Tangled Up in Blue: Deciphering a Bob Dylan Masterpiece

Dylan’s “Tangled Up in Blue” strikes a middle point between his more surreal lyrics of the ‘60s and his more straightforward love songs, and as Polyphonic’s recent video taking a deep dive into this “musical masterpiece” shows, that combination is why so many count it as one of his best songs.

It is the opening track of Blood on the Tracks, the 1975 album that critics hailed as a return to form after four middling-at-best albums. (One of them, Self-Portrait, earned Dylan one of critic Greil Marcus’ best known opening lines: “What is this shit?”--in the pages of Rolling Stone no less.)



Blood on the Tracks is one of the best grumpy, middle-age albums, post-relationship, post-fame, all reckoning and accountability, a survey of the damage done to oneself and others, and “Tangled” is the entry point. Dylan’s marriage to Sara Lowndes Dylan was floundering after eight years--affairs, drink, and drugs had estranged the couple. Dylan would later say that “Tangled” “took me ten years to live and two years to write.”

It would also take him two studios, two cities, and two band line-ups to get working. A version recorded in New York City is slower, lower (in key), and more like one of his guitar-only folk tunes. In December of 1974, Dylan returned home to Minnesota and played the songs to his brother, who wasn’t impressed and suggested he rerecord. The version we know is faster, brighter, janglier, and as Polyphonic explains, sung in a key nearly too high for Dylan. But it’s that wild, near exasperation of reaching those notes that gives the song its lifeblood.

And he also reworked the lyrics, removing whole verses and changing others, until the finished version is, indeed, tangled. It jumps back and forth from present to past to wishful future, verse to verse, and even line to line.

The pronouns change too--the “she” is sometimes the lost love, sometimes a woman who reminds the singer of the former. The further he goes to get away from his first love, the more he meets visions of her elsewhere.

Then there are the details of the travels and the jobs the narrator takes on, leaving fans to parse which are true and which are not (Sara Lowndes, for example, was working at a Playboy club--the “topless place”--when he met her). And even if we could know who the man is in verse six who “started into dealing with slaves”...would it make any difference?

In the end the song feels universal because it is both so specific and so intentionally confounding. “Tangled Up in Blue” affects so many of its listeners, yours truly included, because it recreates the way memories nestle in our minds, not as a linear sequence but as a kaleidoscope of images and feelings.

Related Content:

How Talking Heads and Brian Eno Wrote “Once in a Lifetime”: Cutting Edge, Strange & Utterly Brilliant

Classic Songs by Bob Dylan Re-Imagined as Pulp Fiction Book Covers: “Like a Rolling Stone,” “A Hard Rain’s A-Gonna Fall” & More

Watch Joan Baez Endearingly Imitate Bob Dylan (1972)

Hear Bob Dylan’s Newly-Released Nobel Lecture: A Meditation on Music, Literature & Lyrics

Ted Mills is a freelance writer on the arts who currently hosts the artist interview-based FunkZone Podcast and is the producer of KCRW's Curious Coast. You can also follow him on Twitter at @tedmills, read his other arts writing at tedmills.com and/or watch his films here.

Tangled Up in Blue: Deciphering a Bob Dylan Masterpiece is a post from: Open Culture. Follow us on Facebook, Twitter, and Google Plus, or get our Daily Email. And don't miss our big collections of Free Online Courses, Free Online Movies, Free eBooks, Free Audio Books, Free Foreign Language Lessons, and MOOCs.

Quiet Earth: This Girl Don't Play with Dolls! ZILLA AND ZOE Trailer

Zoe, age 10, is obsessed with making horror films. A deadline for a big horror contest is coming up, and Zoe is determined to win. But when her father, concerned about her increasingly gory activities, orders her to stop making horror films and shoot her sister's wedding week instead, Zoe has no other choice... To win the contest of her dreams, she will have to turn her sister's wedding into a horror film.

After a successful film festival run, Zilla and Zoe is set for a wider theatrical run across the United States.


The film will play in several cities beginning May 17th when the film will be at the Laemmle theatre in North Hollywood, with filmmakers in attendance.


Zilla and Zoe is directed by Jessica Scalise.


Check out the trailer below:


[Continued ...]

Quiet Earth: IT CHAPTER TWO Trailer

Evil resurfaces in Derry as director Andy Muschietti reunites the Losers Club—young and adult—in a return to where it all began with “It Chapter Two.”

The film is Muschietti’s follow-up to 2017’s critically acclaimed and massive worldwide box office hit “IT,” which grossed over $700 million globally. Both redefining and transcending the genre, “IT” became part of the cultural zeitgeist as well as the highest-grossing horror film of all time.

Because every 27 years evil revisits the town of Derry, Maine, “It Chapter Two” brings the characters—who’ve long since gone their separate ways—back together as adults, nearly three decades after the events of the first film.

Oscar nominee Jessica Cha [Continued ...]

Colossal: Explore Mathematical Concepts Hands-On With a Paper Folding Kit by Kelli Anderson

Brooklyn-based designer Kelli Anderson (previously) continues to wow us with her inventive and interactive paper creations. Anderson’s Folding Paper DIY Kit builds on mathematical concepts to provide a hands-on way to learn about the shape-shifting possibilities of this everyday material. Each kit includes eight sheets of auxetic folding patterns along with instructions for each design: the Miura-ori fold, Ron Resch’s Square Twist, a modified version of the classic Waterbomb pattern, and an experimental Sequent fold. Fun fact: this kit was inspired by a workshop Anderson designed as part of Colossal’s exhibition Inflatable at the Exploratorium! You can find the Folding Paper DIY Kit in The Colossal Shop.

Open Culture: A Mesmerizing Trip Across the Brooklyn Bridge: Watch Footage from 1899

It’s hardly original advice but bears repeating anyway: no one visiting New York should leave, if they can help it, before they cross the Brooklyn Bridge—preferably on foot, if possible, and at a reverential pace that lets them soak up all the Neo-gothic structure’s storied history. Walk from Manhattan to Brooklyn, then back again, or the other way around, since that’s what the bridge was built for—the commutes of a nineteenth century bridge-but-not-yet-tunnel crowd (the first NYC subway tunnel didn’t open until 1908).

In 1899, filmmakers from American Mutoscope and Biograph elected for a mode of travel for a New York century, putting a camera at “the front end of a third rail car running at high speed,” notes a 1902 American Mutoscope catalogue. They accelerated the tour to the pace of a modern machine, choosing the Manhattan to Brooklyn route. “The entire trip consumes three minutes of time, during which abundant opportunity is given to observe all the structural wonders of the bridge, and far distant river panorama below.” (See one-third of the trip just below.)

Filmmaker Bill Morrison looped excerpts of those three New York minutes and extended them to nine in his short, stereoscopic journey “Outerborough,” at the top, commissioned by The Museum of Modern Art in 2005 and scored with original music by Todd Reynolds. Taking the 1899 footage as its source material, the film turns a rapid transit tour into a moving mandala, a fractal repetition at frighteningly faster and faster speeds, of the bridge’s most mechanical vistas—the views of its looming, vaulted arches and of the steel cage surrounding the tracks.

One of the engineering wonders of the world, the Brooklyn Bridge opened 136 years ago this month, on May 24th, 1883. The first person to walk across it was the woman who oversaw its construction for 11 of the 14 years it took to build the bridge. After designer John Roebling died of tetanus, his son Washington took over, only to succumb to the bends during the sinking of the caissons and spend the rest of his life bedridden. Emily, his wife, “took on the challenge,” notes the blog 6sqft, consulting with her husband while actively supervising the project.

She “studied mathematics, the calculations of catenary curves, strengths of materials and the intricacies of cable construction.” On its opening day, Emily walked the bridge’s 1,595 feet, from Manhattan to Brooklyn, “her long skirt billowing in the wind as she showed [the crowd] details of the construction,” writes David McCullough in The Great Bridge. Six days later, an accident caused a panic and a stampede that killed twelve people. Some months later, P.T. Barnum’s Jumbo led a parade of 21 elephants over the bridge in a stunt to prove its safety.

Barnum’s theatrics were surprisingly honest—the bridge may have needed selling to skeptical commuters, but it needed no hype. It outlived most of its contemporaries, despite the fact that it was built before engineers understood the aerodynamic properties of bridges. The Roeblings designed and built the bridge to be six times stronger than it needed to be, but no one could have foreseen just how durable the structure would prove.

It elicited a fascination that never waned for its palpable strength and beauty, yet fewer of its admirers chose to document the journey that has taken millions of Brooklynites over the river to lower Manhattan, by foot, bike, car, and yes, by train. Leave it to that futurist for the common man, Thomas Edison, to film the trip. See his 1899 footage of Brooklyn to Manhattan by train just above.

via Aeon

Related Content:

Immaculately Restored Film Lets You Revisit Life in New York City in 1911

An Online Gallery of Over 900,000 Breathtaking Photos of Historic New York City

New York City: A Social History (A Free Online Course from N.Y.U.) 

Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness

A Mesmerizing Trip Across the Brooklyn Bridge: Watch Footage from 1899 is a post from: Open Culture. Follow us on Facebook, Twitter, and Google Plus, or get our Daily Email. And don't miss our big collections of Free Online Courses, Free Online Movies, Free eBooks, Free Audio Books, Free Foreign Language Lessons, and MOOCs.

Colossal: Bold Paper Quilled Artworks by JUDiTH + ROLFE Burst With Color and Character

Minnesota-based artistic collective JUDiTH + ROLFE sculpts paper into voluptuous plant and flower motifs blossoming with movement and character. Featuring many botanical species, including magnolias, irises, and begonias, the duo’s work is a reminder of the diversity of plant structure and form. Each of their floral forms is ‘quilled’ into its shape, from the delicate veins making up the plant’s skeleton to the fleshy petals exploding with color.

The duo’s business name, JUDiTH + ROLFE, is derived from their middle names: JUDiTH (Daphne Lee) is the artist, while her partner, ROLFE (Jamie Sneed), runs the business and logistics. “Before embarking on this journey as a paper artist, I worked for over a decade as an architect in New York City, which is also where I met my husband, Rolfe,” Lee tells Colossal.

Lee and Sneed were drawn to paper as a medium due to its availability and transformability: depending on light, shadows and perspective, their artworks change shape and form. “The technique I use most can broadly be called ‘quilling’ since I work with strips of paper and lay them on edge to form designs,” says Lee. Paper quilling is an artistic practice dating back to the 15th century, which was initially used to decorate religious objects. Basing her technique on the ancient craft, Lee gives her work a contemporary twist by creating big and bold pieces of single flowers or plants. In her process, Lee treats each strip of paper as its own line, from which she ‘sculpts’ her floral artworks. “The paper strips are glued individually to create the artwork, not unlike sketching with paper,” Lee explains. But unlike sketching with paper, Lee’s 3D artworks blossom out of their frame, mirroring the fragile flowers they resemble. 

To view more of JUDITH + ROLFE’s work, visit their website or their Instagram page.

BOOOOOOOM! – CREATE * INSPIRE * COMMUNITY * ART * DESIGN * MUSIC * FILM * PHOTO * PROJECTS: “Crowded Fields” by Photographer Pelle Cass

[A series of photographs by Pelle Cass]

Pelle Cass’s Website

Pelle Cass on Instagram

Open Culture: Discover Friedrich Nietzsche’s Curious Typewriter, the “Malling-Hansen Writing Ball” (Circa 1881)

During his final decade, Friedrich Nietzsche’s worsening constitution continued to plague the philosopher. In addition to having suffered from incapacitating indigestion, insomnia, and migraines for much of his life, the 1880s brought about a dramatic deterioration in Nietzsche’s eyesight, with a doctor noting that his “right eye could only perceive mistaken and distorted images.”

Nietzsche himself declared that writing and reading for more than twenty minutes had grown excessively painful. With his intellectual output reaching its peak during this period, the philosopher required a device that would let him write while making minimal demands on his vision.

So he sought to buy a typewriter in 1881. Although he was aware of Remington typewriters, the ailing philosopher looked for a model that would be fairly portable, allowing him to travel, when necessary, to more salubrious climates. The Malling-Hansen Writing Ball seemed to fit the bill:



In Dieter Eberwein’s free Nietzsches Schreibkugel e-book, the vice president of the Malling-Hansen Society explains that the writing ball was the closest thing to a 19th century laptop. The first commercially-produced typewriter, the writing ball was the 1865 creation of Danish inventor Rasmus Malling-Hansen, and was shown at the 1878 Paris Universal Exhibition to journalistic acclaim:

"In the year 1875, a quick writing apparatus, designed by Mr. L. Sholes in America, and manufactured by Mr. Remington, was introduced in London. This machine was superior to the Malling-Hansen writing apparatus; but the writing ball in its present form far excels the Remington machine. It secures greater rapidity, and its writing is clearer and more precise than that of the American instrument. The Danish apparatus has more keys, is much less complicated, built with greater precision, more solid, and much smaller and lighter than the Remington, and moreover, is cheaper."

Despite his initial excitement, Nietzsche quickly grew tired of the intricate contraption. According to Eberwein, the philosopher struggled with the device after it was damaged during a trip to Genoa; an inept mechanic trying to make the necessary repairs may have broken the writing ball even further. Still, Nietzsche typed some 60 manuscripts on his writing ball, including what may be the most poignant poetic treatment of typewriters to date:

“THE WRITING BALL IS A THING LIKE ME:

MADE OF IRON YET EASILY TWISTED ON JOURNEYS.

PATIENCE AND TACT ARE REQUIRED IN ABUNDANCE

AS WELL AS FINE FINGERS TO USE US."

In addition to viewing several of Nietzsche’s original typescripts at the Malling-Hansen Society website, those wanting a closer look at Nietzsche’s model can view it in the video below.

Note: This post originally appeared on our site in December 2013.

Ilia Blinderman is a Montreal-based culture and science writer. Follow him at @iliablinderman.

Related Content:

Mark Twain Wrote the First Book Ever Written With a Typewriter

The Keaton Music Typewriter: An Ingenious Machine That Prints Musical Notation

The Enduring Analog Underworld of Gramercy Typewriter

Discover Friedrich Nietzsche’s Curious Typewriter, the “Malling-Hansen Writing Ball” (Circa 1881) is a post from: Open Culture. Follow us on Facebook, Twitter, and Google Plus, or get our Daily Email. And don't miss our big collections of Free Online Courses, Free Online Movies, Free eBooks, Free Audio Books, Free Foreign Language Lessons, and MOOCs.

BOOOOOOOM! – CREATE * INSPIRE * COMMUNITY * ART * DESIGN * MUSIC * FILM * PHOTO * PROJECTS: Giveaway: Ultimate Ears MEGABOOM 3 Speaker

BOOOOOOOM! – CREATE * INSPIRE * COMMUNITY * ART * DESIGN * MUSIC * FILM * PHOTO * PROJECTS: Artist Spotlight: Dylan Gebbia-Richards

[A series of artworks by Dylan Gebbia-Richards]

Dylan Gebbia-Richards’s Website

Dylan Gebbia-Richards on Instagram

Open Culture: What is Camp? When the “Good Taste of Bad Taste” Becomes an Aesthetic

Even if you don't care about high fashion or high society — to the extent that those two things have a place in the current culture — you probably glimpsed some of the coverage of what attendees wore to the Met Gala earlier this month. Or perhaps coverage isn't strong enough a word: what most of the many observers of the Metropolitan Museum of Art's Costume Institute annual fundraising gala did certainly qualified as analysis, and in not a few cases tipped over into exegesis. That enthusiasm was matched by the flamboyance of the clothing worn to the event — an event whose co-chairs included Lady Gaga, a suitable figurehead indeed for a party that this year took on the theme of camp.

But what exactly is camp? You can get an in-depth look at how the world of fashion has interpreted that elaborate and entertaining but nevertheless elusive cultural concept in the Met's show Camp: Notes on Fashion, which runs at the Met Fifth Avenue until early September.



"Susan Sontag's 1964 essay Notes on 'Camp' provides the framework for the exhibition," says the Met's web site, "which examines how the elements of irony, humor, parody, pastiche, artifice, theatricality, and exaggeration are expressed in fashion." But for a broader understanding of camp, you'll want to go back to Sontag's and read all of the 58 theses it nailed to the door of the mid-1960s zeitgeist.

According to Sontag, camp is "not a natural mode of sensibility" but a "love of the unnatural: of artifice and exaggeration." It offers a "way of seeing the world as an aesthetic phenomenon." Most anything manmade can be camp, and Sontag's list of examples includes Tiffany lamps, "the Brown Derby restaurant on Sunset Boulevard in L.A.," Aubrey Beardsley drawings, and old Flash Gordon comics. Elevating style "at the expense of content," camp is suffused with "the love of the exaggerated, the 'off,' of things-being-what-they-are-not." Camp is not irony, but it "sees everything in quotation marks." The essential element of camp is "seriousness, a seriousness that fails." Camp "asserts that good taste is not simply good taste; that there exists, indeed, a good taste of bad taste."

"When Sontag published ‘Notes on Camp,’ she was fascinated by people who could look at cultural products as fun and ironic," says Sontag biographer Benjamin Moser in a recent Interview magazine survey of the subject. And though Sontag's essay remains the definitive statement on camp, not everyone has agreed on exactly what counts and does not count as camp in the 55 years since its publication in the Partisan Review"Camp to me means over-the-top humor, usually coupled with big doses of glamour," says fashion designer Jeremy Scott in the same Interview article. "To be interesting, camp has to have some kind of political consciousness and self-awareness about what it’s doing," says filmmaker Bruce Labruce, challenging Sontag's description of camp as apolitical.

And what will become of camp in the all-digitizing 21st century, when many eras increasingly coexist on the same culture plane? Our time “has cannibalized camp," says cultural history professor Fabio Cleto, "but to say that it’s no longer camp because its aesthetics have gone mainstream is an overly simplistic reading. Camp has always been mourning its own death.” Even so, some of camp's most high-profile champions have cast doubt on its viability. The phrase "good taste of bad taste" brings no figure to mind more quickly than Pink Flamingos and Hairspray director John Waters (who speaks on the origin of his good taste in bad taste in the Big Think video above). But even he speaks pessimistically to Interview about camp's future: "Camp? Nothing is so bad it’s good now that we have Trump as president. He even ruined that."

Related Content:

Why So Many People Adore The Room, the Worst Movie Ever Made? A Video Explainer

David Foster Wallace on What’s Wrong with Postmodernism: A Video Essay

The Star Wars Holiday Special (1978): It’s Oh So Kitsch

Susan Sontag’s 50 Favorite Films (and Her Own Cinematic Creations)

John Waters Talks About His Books and Role Models in a Whimsical Animated Video

Based in Seoul, Colin Marshall writes and broadcasts on cities, language, and culture. His projects include the book The Stateless City: a Walk through 21st-Century Los Angeles and the video series The City in Cinema. Follow him on Twitter at @colinmarshall or on Facebook.

What is Camp? When the “Good Taste of Bad Taste” Becomes an Aesthetic is a post from: Open Culture. Follow us on Facebook, Twitter, and Google Plus, or get our Daily Email. And don't miss our big collections of Free Online Courses, Free Online Movies, Free eBooks, Free Audio Books, Free Foreign Language Lessons, and MOOCs.

Planet Haskell: Ken T Takusagawa: [efkabuai] Always have an escape plan

We consider a situation in which an internet site anticipates that it might disappear from its current address, perhaps because of censorship.  In preparation for potentially moving to a yet undecided location, we propose a method of preemptively publishing "digital breadcrumbs" (referencing Hansel and Gretel) on its current site as a way for users to find the next location if or when the site moves.  The site encourages users to save a local copy of the breadcrumbs.

The published digital breadcrumbs consist of two parts: a public key and the specification of a deterministic pseudorandom word generator including its random seed.

As an example, we'll specify digital breadcrumbs for this blog.

Here is the public key for this blog, with fingerprint 2C95 B41D A4CE 7C5F 0110 C27A 561D DCBA 1BDA E2F7:

-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1

mQENBFUrkNQBCADRc2ia1qpiS8wwrsqnPQpUoGY8DC1+tyRs6xGTqkkdxADLkS6f
VPuSkMktr0D/NmUmyCVSPkITYMeDlZ09eG2DIl33zS5ZpxTLgbHau8o8QB2cXw4f
ldwDrt5UmQwc8jF6vwKqoXyxPxJIb59fxCQ5s6llurnUI9MdlhDMyRQ0rFHkXu8G
JX+49zisWep7ZLZRT7/zdlKNlw2mriMTavOajCXtfR4WnFbQ8oYBkYLJPZFk4bi6
p4pyX9/nwcKCF2yIs7d3GqkYuuYSpp3gBdK+rAmYAj52cWEm08dtgurkDbh9rD/t
ykiDqHxh2oCou63Tnjt9qrCdjy7f0AstS7qZABEBAAG0PEtlbiAoaHR0cDovL2tl
bnRhLmJsb2dzcG90LmNvbSkgPGRldm51bGxAa2VudGEuYmxvZ3Nwb3QuY29tPokB
OAQTAQIAIgUCVSuQ1AIbAwYLCQgHAwIGFQgCCQoLBBYCAwECHgECF4AACgkQVh3c
uhva4vddMQgAuDcuimghGXzhazv/S86oCfZ3vtwqh5aZzW8N3rz/0tB0o+hZjgCv
imu/N0m1Hv8IdFOexeHa6SgCuDg8xmRCSiFYgumDi6cQy9XCH4+mCfn5oiu1mmrg
leBnV4gRF0u5m7i4pzoBsdbRU0mmKUnRUV4KKkVEsOpZla48AOdkX4SaRGq8sPft
BRbUUoJf4/HVbZKLvJGqau270NbtHoM+AOe+Pk8X6AaPBl5vA6vep7zxRJayFiBm
mlxN6vU8FoH5sBYTdCrN84h0kQDtFszVoYXl1QEF0ek3LjlrEVJvXBJRGy4ZLnv5
juR7vPk1LhLcq28078ucnHo6Hh3uslFnHrkBDQRVK5DUAQgApksDkc2/iTCaFXbm
o1Ojb9VyOITquEzBjMMY/K58MuSKqw6X3PkzsIWVoUO1binlIWtoBHc8ooeWm2Ve
uhardx1SmpGE17UJZRwe+bPI6AXzGM2vFpa7JZTbFY90rgqIWOVrPbBL2+bZds54
ySX6xuVl70H5+NsJIqcqFH5bHoLQtzLfoqLQrIK4Azmv/2NP/nAVySECznyyQT0n
i25RNgiOjUwfKx2M4ICavQ/T1Of9YhnVSP3Cpz9+kDFV7VKiGCBKSSsahQbLOL3u
4dkG9QKY5vjU+dkjziZehHdNQrI9tMthCSopeLsUZ/ooQTj5IozdRTxVryxSbZKR
Tp0TDQARAQABiQEfBBgBAgAJBQJVK5DUAhsMAAoJEFYd3Lob2uL35CQH/RzXJVor
znGnWE/YPtb3qAFMit0APhOT0WWXCASJYjjzOJDMghjN70kqgPvmdjAL182wBVVz
pjEbZBtPDcEx9YYfKiRC3qCygiSjdKRLfSwHmzFeYzGM6DI4GIsJW+1q8+iA2DUs
lXLuyw8TJjH7jvY/cMhM6IzzdZQNnsaSZ9whCKM0t4yWDSxZ/cLk+ezPwEBJQmNR
+dOCvmLy3PbUlLOxN18gVezyFpHy0Bj6UGQ+hyTj2fp4tKiD9LF4iHsOYJdNcHrH
+CJJajhOKYmAB3RexlHunlLqAiLoGrcJUehi9hO4QnC52VvKa+2AryUKhQPPOZwa
iMz2Xdl91SRrAb8=
=mOe7
-----END PGP PUBLIC KEY BLOCK-----

Messages signed with a site's public key should contain the text Serial: NNN where NNN is a string in the format of a Debian software package version number, roughly epoch:version, where the epoch: is optional and version is a dotted number string.  (Inspired by the serial number of DNS zone records.  Some sites use serial numbers like 20190228 to ensure the serial number monotonically increases.  This would be nicer if dots were allowed: 2019.02.28.)  These serial numbers can be compared as version numbers as specified by Debian.  Having comparable serial numbers establishes a total ordering on messages, useful if later messages need to invalidate older messages.
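To make the comparison concrete, here is a minimal Haskell sketch of such an ordering. It is an illustration only, not part of this post's signed material: it handles just an optional numeric epoch followed by a dotted numeric version, ignoring the full Debian comparison rules, and the names Serial, parseSerial, and compareSerial are hypothetical.

-- Illustrative sketch: compare simplified "epoch:version" serial numbers.
data Serial = Serial { serialEpoch :: Int, serialVersion :: [Int] }
  deriving (Eq, Ord, Show)

-- Parse an optional numeric epoch ("1:...") followed by a dotted
-- numeric version ("2019.02.28"); a missing epoch defaults to 0.
parseSerial :: String -> Serial
parseSerial s = case break (== ':') s of
  (e, ':' : rest) | not (null e) && all (`elem` ['0'..'9']) e ->
    Serial (read e) (dotted rest)
  _ -> Serial 0 (dotted s)
  where
    dotted = map read . splitOn '.'
    splitOn c str = case break (== c) str of
      (x, [])       -> [x]
      (x, _ : rest) -> x : splitOn c rest

-- Total ordering on serial strings: epoch first, then version
-- components left to right (the derived Ord on Serial does exactly this).
compareSerial :: String -> String -> Ordering
compareSerial a b = compare (parseSerial a) (parseSerial b)

For example, compareSerial "0.2015.6" "0.2019.1" yields LT, and a serial with an explicit epoch such as "1:0.1" sorts after any epochless one.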

Here is the first message signed with our public key, giving an example serial number.  This message is otherwise uselessly empty.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Serial: 0.2015.6
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEVAwUBVeD8B1Yd3Lob2uL3AQI1kwgAmUIKTr6rfffyceIsOL2UJ/Zw+bcdsdHt
xlzRVAQUocCZVEMhnloRN6bj2PsEMWtO9PxN4y46EBmVImjPYRTPa3FRdVoB7tL9
Lo/ExpoQ93tkz/zMrhC3siq2dwe2FxehqIU1diqEORT3FKs3D7Zbvb3qunifzihH
QoXOgEg6sQQzjKmUzMpwbt3SV86cpYLNE5GK9aaJLaYKf9yC7FQ9YnA8Dd/pVqBb
OCk4PHW/iRhzbPJhBHTLmL4hBH1rVIqZvXPq4msa13tHxQce+2owzoyCQD9PXPyx
TUFzDCJzoz7eDM12oyBbXRHvyWroT6khavrn/meaWofiKvB9ytAwpQ==
=ZyLY
-----END PGP SIGNATURE-----

We leave unsolved the standard hard problems of Public Key Infrastructure: what to do if your private key is compromised, what to do if you lose access to your private key.

We now turn our attention to the random word generator.

To combat yet unknown countermeasures an adversarial censor might deploy, the digital breadcrumbs provide only hints toward the many possibilities where the next site might be.  Exactly how generated random words specify the site's next location is deliberately left unstated in order to flexibly route around the yet unknown censorship: perhaps DNS, perhaps it should be used as a web search term, perhaps a key into some censorship-resistant medium.  It is the responsibility of the user, the recipient of the breadcrumbs, to try things and figure it out.  The infinite stream of random words provides many choices in case some of the earlier words are not available (perhaps due to censorship).  The public key allows the user to verify, perhaps even in an automated fashion, that he or she has found the right new site.

(Previous vaguely related idea: search the entire internet for the latest "message" signed by a public key.)

(Unfortunately, the random words won't directly be usable as addresses for things like Onion hidden services or Freenet CHK because for those, you don't have the ability to choose your own address.  Freenet KSK might be usable, at least until spammers overwrite it.)

A cryptographically secure pseudorandom number generator (PRNG) is not strictly necessary; however, we use one in the sample implementation below because cryptographic primitives are standardized, and so widely available and easily portable.

Two unrelated entities (perhaps even adversaries) can use exactly the same random word generator, perhaps because they chose the same seed.  We expect this to be only a slight inconvenience because their different public keys can be used to distinguish them.  They can also choose different words in the same random stream.

Here is the random word generator for this blog, suggesting a template for others.  The random word generator is implemented in Haskell, similar to this stream cipher example.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

{- This code is public domain. -}
{- Serial: 0.2019.1 -}
{-# LANGUAGE PackageImports #-}
module Main where {
import "cipher-aes" Crypto.Cipher.AES as AES;
import qualified Data.ByteString as ByteString;
import Data.Word (Word8);
import Data.Maybe (catMaybes);
import qualified Crypto.Hash.SHA256 as SHA256;
import Data.Text.Encoding (encodeUtf8);
import qualified Data.Text as Text;
import qualified Data.List as List;

url :: String;
url = "http://kenta.blogspot.com";

main :: IO();
main = putStr $ catMaybes $ map filt27 $ to_bytes $ aes_stream
$ AES.initAES $ SHA256.hash $ encodeUtf8 $ Text.pack url;

aes_stream :: AES.AES -> [ByteString.ByteString];
aes_stream key = List.unfoldr (Just . (next_block key)) zero;

next_block :: AES.AES -> AES.AESIV ->
(ByteString.ByteString, AES.AESIV);
next_block key iv = AES.genCounter key iv 1; {- Although we ask
for just 1 byte, genCounter will return a full 16-byte block. -}

{- Initial counter value -}
zero :: AES.AESIV;
zero = AES.aesIV_ $ ByteString.replicate 16 0;

to_bytes :: [ByteString.ByteString] -> [Word8];
to_bytes = concat . map ByteString.unpack;

filt27 :: Word8 -> Maybe Char;
filt27 z = case mod z 32 of {
0 -> Just ' ';
x | x<=26 -> Just $ toEnum $ fromIntegral x + fromEnum 'a' - 1;
_ -> Nothing;
};
}
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEVAwUBXDtUc1Yd3Lob2uL3AQLb1QgAzeb/UebCk4cb0nSXdMSaSWwItxSbOvXK
qzBqE8EwCsg/uz3ry5MB24nFUd0puO9LqEy0okebCdZqj5qWdPK/PnLZj5Zx+ZG2
sUHNSe7pn6gfJfL9+JDoVLRJaJt2Cn/c4KUT2uC7Xsig6RAhKcIKMytCnU8jDG2P
S60Qdp3rk/GqgK6gHViTrLjckUAuV5nID+pxWzqE3Rx753w0W3wK/5f+giovWizk
CTuPkYGv7eFzO9zPN9tLo4na+MfaKskFpJ3PsAQFpJbcIM+RH80HNhJ06i0jjHuX
Rf2IcPnaBwtLYqjHK8WJ1dnhK2wHVLq2wD0/zNmCs1QPcm1Rbv9E6g==
=DL7c
-----END PGP SIGNATURE-----

Prose description of what the code does: The seed string http://kenta.blogspot.com is hashed with SHA-256, (yielding f78183140c842e8f4a550c3a5eb5663a33706fc07eeacc9687e70823d44511c4), which is then used (unsalted) as a key for AES-256 in counter mode (CTR) to generate a stream of bytes.  The counter starts at the zero block and increments as a 128-bit big-endian integer.  We read each AES output block as bytes in big-endian order.  Each byte is truncated to its least significant five bits (i.e., modulo 32), then encoded into letters by the following substitution: 0=space, 1=A, 2=B,... 26=Z.  Values 27 through 31 are ignored, encoding to nothing.  (They might be useful for future extensions.)  Occasionally there may be two or more consecutive spaces, a rare occurrence which currently has no special meaning but may be useful for future extensions.  For reference, our random word stream begins: d efkabuai yqwrmhspliasmvokhvwhvz fdvhdwhoxkcqfujilomqjubxfzjtug ...
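As a rough sanity check of the description above, one could add a small helper to the signed module (shown here in the same brace-and-semicolon style) and load it in ghci. The helper firstChars is hypothetical and not part of the signed code; it simply reuses url, aes_stream, to_bytes, and filt27 from the module above.

{- Hypothetical helper, not part of the signed code above: the first n
characters of the word stream.  In ghci, firstChars 32 should print the
prefix quoted in the prose above. -}
firstChars :: Int -> String;
firstChars n = take n $ catMaybes $ map filt27 $ to_bytes $ aes_stream
$ AES.initAES $ SHA256.hash $ encodeUtf8 $ Text.pack url;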

This idea is similar to and kind of an elaboration of the Document ReFinding Key also deployed on this blog which helps in finding, via a search engine, new addresses of individual posts (for which you have an old URL) should this blog move.  Rather than associating just a single random word with an internet resource, we now associate an infinite stream of random words.  The template above can easily be extended to arbitrary URLs, such as those of individual posts.

The technical details were inspired by a mechanism used by some computer virus to call home and update itself, using a deterministic pseudorandom word generator to generate internet domains, one of which the virus author would anonymously purchase in order to post update code.

Penny Arcade: News Post: Too Fast, Too Furious

Tycho: I have been seized by a terrible illness, unfortunately not the kind of illness that would lend vigor to my upcoming mixtape.  No, it’s more the kind of illness associated with disease.  You might note that at the bottom of the strip, in the lower right hand corner, the name Amy is visible - this is a reference to Amy T Falcone, Strip Search alum and repper of the natural world, but also our good friend.  So good a friend, in fact, that she colored the strip on stream while Gabriel stuffed his mouth with hot chicken.  Thus was the credit earned. So this week has…

Planet Haskell: Tweag I/O: inline-js:seamless JavaScript/Haskell interop

Shao Cheng

Tweag.io has a bit of a history with language interop. By this point, we have created or collaborated with others in the community on HaskellR, inline-c, inline-java, and now inline-js. The original idea for this style of interop was realized in language-c-inline by Manuel Chakravarty a few years before he joined Tweag, concurrently with HaskellR. Manuel wrote a blog post about the design principles that underpin all these different libraries. Others in the community have since created similar libraries such as clr-inline, inline-rust and more. In this post, we'll present our latest contribution to the family: inline-js.

The tagline for inline-js: program Node.js from Haskell.

A quick taste of inline-js

Here is a quick demo of calling the Node.js DNS Promises API to resolve a domain:

{-# LANGUAGE DeriveAnyClass, DeriveGeneric, QuasiQuotes #-}

import Data.Aeson
import GHC.Generics
import Language.JavaScript.Inline

data DNSRecord = DNSRecord
  { address :: String
  , family :: Int
  } deriving (FromJSON, Generic, Show)

dnsLookup :: String -> IO [DNSRecord]
dnsLookup hostname =
  withJSSession
    defJSSessionOpts
    [block|
        const dns = (await import("dns")).promises;
        return dns.lookup($hostname, {all: true});
    |]

To run it in ghci:

*Blog> dnsLookup "tweag.io"
[DNSRecord {address = "104.31.68.163", family = 4},DNSRecord {address = "104.31.69.163", family = 4},DNSRecord {address = "2606:4700:30::681f:44a3", family = 6},DNSRecord {address = "2606:4700:30::681f:45a3", family = 6}]

We can see that the A/AAAA records of tweag.io are returned as Haskell values.

This demo is relatively small, yet already enough to present some important features described below.

The QuasiQuoters

In the example above, we used block to embed a JavaScript snippet. Naturally, two questions arise: what content can be quoted, and what's the generated expression's type?

block quotes a series of JavaScript statements, and in-scope Haskell variables can be referred to by prefixing their names with $. Before evaluation, we wrap the code in a JavaScript async function, and this clearly has advantages against evaluating unmodified code:

  • When different blocks of code share a JSSession, the local bindings in one block don't pollute the scope of another block. And it's still possible to add global bindings by explicitly operating on global; these global bindings will persist within the same JSSession.

  • We can return the result back to Haskell any time we want; otherwise we'll need to ensure the last executed statement happens to be the result value itself, which can be tricky to get right.

  • Since it's an async function, we have await at our disposal, so working with async APIs becomes much more pleasant.

When we call dnsLookup "tweag.io", the constructed JavaScript code looks like this:

(async ($hostname) => {
  const dns = (await import("dns")).promises;
  return dns.lookup($hostname, {all: true});
})("tweag.io").then(r => JSON.stringify(r))

As we can see, the Haskell variables are serialized and put into the argument list of the async function. Since we're relying on FromJSON to parse the result in this case, the result of the async function is further mapped with JSON.stringify.

We also provide an expr QuasiQuoter for when the quoted code is expected to be a single expression. Under the hood it adds return and reuses the implementation of block, to save a few keystrokes for the user.

Haskell/JavaScript data marshaling

The type of block's generated expression is JSSession -> IO r, with hidden constraints placed on r. In our example, we're returning [DNSRecord] which has a FromJSON instance, so that instance is picked up, and on the JavaScript side, JSON.stringify() is called automatically before returning the result back to Haskell. Likewise, since hostname is a String which supports ToJSON, upon calling dnsLookup, hostname is serialized to JSON to be embedded in the JavaScript code.

For marshaling user-defined types, ToJSON/FromJSON is sufficient. This is quite convenient when binding a JavaScript function, since the ToJSON/FromJSON instances are often free thanks to Haskell's amazing generics mechanism. However, a few other useful non-JSON types are also supported:

  • The ByteString types in the bytestring package, including strict/lazy/short versions. It's possible to pass a Haskell ByteString to JavaScript, which shows up as a Buffer. Going in the other direction works too.

  • The JSVal type which is an opaque reference to a JavaScript value, described in later sections of this post.

  • The () type (only as a return value), meaning that the JavaScript return value is discarded.

Ensuring the expr/block QuasiQuoters work with both JSON/non-JSON types involves quite a bit of type hackery, so we hide the relevant internal classes and it's currently not possible for inline-js users to add new such non-JSON types.

Importing modules & managing sessions

When prototyping inline-js, we felt the need to support the importing of modules, either built-in or user-supplied ones. Currently, there are two different import mechanisms coexisting in Node.js: the old CommonJS-style require() and the new ECMAScript native import. It's quite non-trivial to support both, and we eventually chose to support ECMAScript dynamic import() since it works out-of-the-box on both web and Node, making it more future-proof.

Importing a built-in module is straightforward: import(module_name) returns a Promise which resolves to that module's namespace object. When we need to import npm-installed modules, we need to specify their location in the settings to initialize JSSession:

import Data.ByteString (ByteString)
import Data.Foldable
import Language.JavaScript.Inline
import System.Directory
import System.IO.Temp
import System.Process

getMagnet :: String -> FilePath -> IO ByteString
getMagnet magnet filename =
  withSystemTempDirectory "" $ \tmpdir -> do
    withCurrentDirectory tmpdir $
      traverse_
        callCommand
        ["npm init --yes", "npm install --save --save-exact webtorrent@0.103.1"]
    withJSSession
      defJSSessionOpts {nodeWorkDir = Just tmpdir}
      [block|
        const WebTorrent = (await import("webtorrent")).default,
          client = new WebTorrent();

        return new Promise((resolve, reject) =>
          client.add($magnet, torrent =>
            torrent.files
              .find(file => file.name === $filename)
              .getBuffer((err, buf) => (err ? reject(err) : resolve(buf)))
          )
        );
    |]

Here, we rely on the webtorrent npm package to implement a simple BitTorrent client function getMagnet, which fetches the file content based on a magnet URI and a filename. First, we allocate a temporary directory and run npm install in it; then we supply the directory path in the nodeWorkDir field of session config, so inline-js knows where node_modules is. And finally, we use the webtorrent API to perform downloading, returning the result as a Haskell ByteString.

Naturally, running npm install for every single getMagnet call doesn't sound like a good idea. In a real-world Haskell application which calls npm-installed modules with inline-js, the required modules should be installed by the package build process, e.g. by using Cabal hooks to install them into the package's data directory; getMagnet can then use that data directory as the working directory of Node.

Now, it's clear that all code created by the QuasiQuoters in inline-js requires a JSSession state, which can be created by newJSSession or withJSSession. There are a couple of config fields available which allow one to specify the working directory of Node, pass extra arguments, or redirect the Node process's standard error output.

How it works

Interacting with Node from Haskell

There are multiple possible methods to interact with Node in other applications, including in particular:

  • Whenever we evaluate some code, start a Node process to run it, and fetch the result either via standard output or a temporary file; persistent Node state can be serialized via structural cloning. This is the easiest way but also has the highest overhead.

  • Use pipes/sockets for IPC, with inline-js starting a script to get the code, perform evaluation and return results, reusing the same Node process throughout the session. This requires more work and has less overhead than calling Node for each call.

  • Use the Node.js N-API to build a native addon; whatever Haskell application relies on inline-js then gets linked with the addon, moving the program entry point to the Node side. We have ABI stability with N-API, and building a native addon is surely less troublesome than building the whole Node stack. Although the IPC overhead is spared, this complicates the Haskell build process.

  • Try to link with Node either as a static or dynamic library, then directly call internal functions. Given that the build system of Node and V8 is a large beast, we thought it would take a considerable amount of effort; even if it's known to work for a specific revision of Node, there's no guarantee later revisions won't break it.

The current implementation uses the second method listed above. inline-js starts an "eval server" which passes binary messages between Node and the host Haskell process via a pair of pipes. At the cost of a bit of IPC-related overhead, we make inline-js capable of working with multiple installations of Node without recompiling. The schema of binary messages and implementation of "eval server" is hidden from users and thus can evolve without breaking the exposed API of inline-js.

The "eval server"

The JavaScript specification provides the eval() function, allowing a dynamically constructed code string to be run anywhere. However, it's better to use the built-in vm module of Node.js, since it's possible to supply a custom global object where JavaScript evaluation happens, so we can prevent the eval server's declarations leaking into the global scope of the evaluated code, while still being able to add custom classes or useful functions to the eval server.

Once started, the eval server accepts binary requests from the host Haskell process and returns responses. Upon an "eval request" containing a piece of UTF-8 encoded JavaScript code, it first evaluates the code, expecting a Promise to be returned. When the Promise resolves with a final result, the result is serialized and returned. Given the asynchronous nature of this pipeline, it's perfectly possible for the Haskell process to dispatch a batch of eval requests, and the eval server to process them concurrently, therefore we also export a set of "async" APIs in Language.JavaScript.Inline which decouples sending requests and fetching responses.

On the Haskell side, we use STM to implement send/receive queues, and they are accompanied by threads which perform the actual sending/receiving. Each user-facing interface either enqueues a request or tries to fetch the corresponding response from a TVar, blocking if the response is not ready yet. In this way, we make almost all exposed interfaces of inline-js thread-safe.

Marshaling data based on types

Typically, the JavaScript code sent to the eval server is generated by the QuasiQuoter's returned code, potentially including some serialized Haskell variables in the code, and the raw binary data included in the eval response is deserialized into a Haskell value. So how are the Haskell variables recognized in quoted code, and how does the Haskell/JavaScript marshaling take place?

To recognize Haskell variables, it would be possible to simply use a regex to pick up any token starting with $ and assume it's a captured Haskell variable, yet this introduces a lot of false positives, e.g. "$not_var", where $not_var actually appears inside a string. So in the QuasiQuoters of inline-js, we perform JavaScript lexical analysis on quoted code, borrowing the lexer in language-javascript. After the Haskell variables are found, the QuasiQuoters generate a Haskell expression including them as free variables, and at runtime, they can be serialized as parts of the quoted JavaScript code.

To perform type-based marshaling between Haskell and JavaScript data, the simplest thing to do is to rely solely on aeson's FromJSON/ToJSON classes. All captured variables should have a ToJSON instance, serialized to JSON which is also a valid piece of ECMAScript, and whatever value is returned should also have a FromJSON instance. However, there are annoying exceptions which can't appropriately be handled via FromJSON/ToJSON instances.

One such type is ByteString. It's very important to be able to support Haskell ByteString variables and expect them to convert to Buffer on the Node side (or vice versa). Unfortunately, the JSON spec doesn't have a special variant for raw binary data. While there are other cross-language serialization schemes (e.g. CBOR) that support it, they introduce heavy npm dependencies to the eval server. Therefore, a reasonable choice is: expect inline-js users to rely solely on FromJSON/ToJSON for their custom types, while also supporting a few special types which have different serialization logic.

Therefore, we have a pair of internal classes for this purpose: ToJSCode and FromEvalResult. All ToJSON instances are also ToJSCode instances, while for ByteString, we encode it with base64 and generate an expression which recovers a Buffer and is safe to embed in any JavaScript code. The FromEvalResult class contains two functions: one to generate a "post-processing" JavaScript function that encodes the result to binary on the Node side, another to deserialize from binary on the Haskell side. For the instances derived from FromJSON, the "post-processing" code is r => JSON.stringify(r), and for ByteString it's simply r => r.

To keep the public API simple, ToJSCode/FromEvalResult is not exposed, and although type inference is quite fragile for QuasiQuoter output, everything works well as long as the relevant variables and return values have explicit type annotations.

Passing references to arbitrary JavaScript values

It's also possible to pass opaque references to arbitrary JavaScript values between Haskell and Node. On the Haskell side, we have a JSVal type to represent such references, and when the returned value's type is annotated to be a JSVal, on the Node side, we allocate a JSVal table slot for the result and pass the table index back. JSVal can also be included in quoted JavaScript code, and they convert to JavaScript expressions which fetch the indexed value.

Exporting Haskell functions to the JavaScript world

Finally, here's another important feature worth noting: inline-js supports a limited form of exporting Haskell functions to the JavaScript world! For functions of type [ByteString] -> IO ByteString, we can use exportHSFunc to get the JSVal corresponding to a JavaScript wrapper function which calls this Haskell function. When the wrapper function is called, it expects all parameters to be convertible to Buffer, then sends a request back to the Haskell process. The regular response-processor Haskell thread has special logic to handle them; it fetches the indexed Haskell function, calls it with the serialized JavaScript parameters in a forked thread, then the result is sent back to the Node side. The wrapper function is async and returns a Promise which resolves once the expected response is received from the Haskell side. Due to the async nature of message processing on both the Node and Haskell side, it's even possible for an exported Haskell function to call into Node again, and it also works the other way.

Normally, the JavaScript wrapper function is async, and async functions work nicely for most cases. There are corner cases where we need the JavaScript function to be synchronous, blocking when the Haskell response is not ready and returning the result without firing a callback. One such example is WebAssembly imports: the JavaScript embedding spec of WebAssembly doesn't allow async functions to be used as imports, since this would involve "suspending" and "resuming" WebAssembly instance state, which might not be economical to implement in today's JavaScript engines. Therefore, we also provide exportSyncHSFunc which makes a synchronous wrapper function to be used in such scenarios. Since it involves completely locking up the main thread in Node with Atomics, this is an extremely heavy hammer and should be used with much caution. We also lose reentrancy with this "sync mode"; when the exported Haskell function calls back into Node, the relevant request will be forever stuck in the message queue, freezing both the Haskell and Node processes.

Summary

We've presented how inline-js allows JavaScript code to be used directly from Haskell, and explained several key aspects of inline-js internals. The core ideas are quite simple, and the potential use cases are endless, given the enormous ecosystem the Node.js community has accumulated over the past few years. Even for development tasks that are not specifically tied to Node.js, it is still nice to have the ability to easily call relevant JavaScript libraries, to accelerate prototyping in Haskell and to compare the correctness and performance of Haskell and JavaScript implementations.

There are still potential improvements to make, e.g. implementing type-based exporting of Haskell functions. But we decided that now is a good time to announce the framework and collect some first-hand usage experience, spot more bugs and hear user opinions on how it can be improved. When we get enough confidence from the feedback of seed users, we can prepare an initial Hackage release. Please spread the word, make actual stuff with inline-js and tell us what you think :)

OCaml Planet: An introduction to OCaml PPX ecosystem

These last few months, I spent some time writing new OCaml PPX rewriters or contributing to existing ones. It's a really fun experience. Toying around with the AST taught me a lot about a language I thought I knew really well. Turns out I actually had no idea what I was doing all these years.

All jokes aside, I was surprised that the most helpful tricks I learned while writing PPX rewriters weren't properly documented. There already exist a few very good introduction articles on the subject, like this 2014 article from Whitequark, this more recent one from Rudi Grinberg, or even this last one from Victor Darvariu, which I only discovered after I actually started writing my own. I still felt like they were slightly outdated or weren't answering all the questions I had when I started playing with PPX and writing my first rewriters.

I decided to share my PPX adventures in the hope that it can help others familiarize themselves with this bit of the OCaml ecosystem and eventually write their first rewriters. The scope of this article is not to cover every single detail of the PPX internals but just to give beginners a gentle introduction to help them get settled. That also means I might omit things that I don't think are worth mentioning or that might confuse the targeted audience, but feel free to comment if you believe this article missed an important point.

It's worth mentioning that a lot of the nice tricks mentioned in these lines were given to me by a wonderful human being called Étienne Millon. Thanks Étienne!

What is a PPX?

PPX rewriters or PPX-es are preprocessors that are applied to your code before passing it on to the compiler. They don't operate on your code directly but on the Abstract Syntax Tree or AST resulting from its parsing. That means that they can only be applied to syntactically correct OCaml code. You can think of them as functions that take an AST and return a new AST.

That means that in theory you can do a lot of things with a PPX, including pretty bad and cryptic things. You could for example replace every instance of true by false, swap the branches of any if-then-else or randomize the order of every pattern-matching case. Obviously that's not the kind of behaviour that we want as it would make it impossible to understand the code since it would be so far from the actual AST the compiler would get. In practice PPX-es have a well defined scope and only transform parts you explicitly annotated.

Understanding the OCaml AST

First things first: what is an AST? An AST is an abstract representation of your code. As the name suggests, it has a tree-like structure where the root describes your entire file. It has children for each bit such as a function declaration or a type definition, each of them having its own children, for example for the function name, its argument and its body, and so on until you reach a leaf such as a literal 1 or "abc", or a variable. In the case of OCaml it's a set of recursive types allowing us to represent OCaml code as an OCaml value. This value is what the parser passes to the compiler so it can type check and compile it to native or byte code. Those types are defined in OCaml's Parsetree module. The entry points there are the structure type which describes the content of an .ml file and the signature type which describes the content of an .mli file.

As mentioned above, a PPX can be seen as a function that transforms an AST. Writing a PPX thus requires you to understand the AST, both to interpret the one you'll get as input and to produce the right one as output. This is probably the trickiest part: unless you've already worked on the OCaml compiler or written a PPX rewriter, this will probably be the first time you two meet. Chances are also high that it'll be a pretty bad first date and you will need some time to get to know each other.

The Parsetree module documentation is a good place to start. The above-mentioned structure and signature types are at the root of the AST, but some other useful types to look at first are:

  • expression which describes anything in OCaml that evaluates to a value, the right hand side of a let binding for instance.
  • pattern which is what you use to deconstruct an OCaml value, the left hand side of a let binding or a pattern-matching case for example.
  • core_type which describes type expressions, i.e. what you would find on the right hand side of a value description in a .mli: val f : <what_goes_there>.
  • structure_item and signature_item which describe the top level AST nodes you can find in a structure or signature such as type definitions, value or module declarations.

Thing is, it's a bit rough and there's no detailed explanation about how a specific bit of code is represented, just type definitions. Most of the time, the type, field, and variant names are self-explanatory but it can get harder with some of the more advanced language features. It turns out there are plenty of comments that are really helpful in the actual parsetree.mli file and that aren't part of the generated documentation. You can find them on github but I personally prefer to have the file opened in a VIM tab when I work on a PPX so I usually open ~/.opam/<current_working_switch>/lib/ocaml/compiler-libs/parsetree.mli.

This works well while exploring but you might also want a more straightforward approach to discovering what the AST representation is for some specific OCaml code. The ppx_tools opam package comes with a dumpast binary that pretty prints the AST for any given piece of valid OCaml code. You can install it using opam:

$ opam install ppx_tools

and then run it using ocamlfind:

$ ocamlfind ppx_tools/dumpast some_file.ml

You can use it on .ml and .mli files or to quickly get the AST for an expression with the -e option:

$ ocamlfind ppx_tools/dumpast -e "1 + 1"

Similarly, you can use the -t or -p options to pretty print ASTs from type expressions or patterns, respectively.

Using dumpast to get both the AST of a piece of code using your future PPX and the AST of the resulting preprocessed code is a good way to start, and will help you figure out the steps required to get there.

Note that the compiler and utop have a similar feature with the -dparsetree flag. Running ocamlc/ocamlopt -dparsetree file.ml will pretty print the AST of the given file, while running utop -dparsetree will pretty print the AST of the evaluated code alongside its evaluation. I tend to prefer the pretty printed AST from dumpast but any of these tools will prove helpful in understanding the AST representation of a given piece of OCaml code.

Language extensions interpreted by PPX-es

OCaml 4.02 introduced syntax extensions meant to be used by external tools such as PPX-es. Knowing their syntax and meaning is important to understand how most of the existing rewriters work because they usually look for those language extensions in the AST to know which part of it they need to modify.

The two language extensions we're interested in here are extension nodes and attributes. They are defined in detail in the OCaml manual (see the attributes and extension nodes sections) but I'll try to give a good summary here.

Extension nodes are used in place of expressions, module expressions, patterns, type expressions or module type expressions. Their syntax is [%extension_name payload]. We'll come back to the payload part a little later. You can also find extension nodes at the top level of modules or module signatures with the syntax [%%extension_name payload]. Hopefully the following cheatsheet can help you remember the basics of how and where you can use them:

type t =
  { a : int
  ; b : [%ext pl]
  }

let x =
  match 1 with
  | 0 -> [%ext pl]
  | [%ext pl] -> true

[%%ext pl]

Because extension nodes stand where regular AST nodes should, the compiler won't accept them and will give you an Uninterpreted extension error. Extension nodes have to be expanded by a PPX for your code to compile.

Attributes are slightly different, although their syntax is very close to that of extensions. Attributes are attached to existing AST nodes instead of replacing them. That means that they don't necessarily need to be transformed and the compiler will ignore unknown attributes by default. They can come with a payload just like extensions and use @ instead of %. The number of @ preceding the attribute name specifies which kind of node they are attached to:

let a = 12 [@attr pl]

let b = "some string" [@@attr pl]

[@@@attr pl]

In the first example, the attribute is attached to the expression 12 while in the second example it is attached to the whole let b = "some string" value binding. The third one is of a slightly different nature as it is a floating attribute. It's not attached to anything per se and just ends up in the AST as a structure item. Because there is a wide variety of nodes to which you can attach attributes, I won't go too far into details here, but a good rule of thumb is: use @@ attributes when you want them attached to structure or signature items; for anything deeper within the AST structure such as patterns, expressions or core types, use the single @ syntax. Looking at the Parsetree documentation can help you figure out where you can find attributes.

Now let's talk about those payloads I mentioned earlier. You can think of them as "arguments" to the extension points and attributes. You can pass different kinds of arguments and the syntax varies for each of them:

let a = [%ext expr_or_str_item] 
let b = [%ext: type_expr_or_sig_item]
let c = [%ext? pattern]

As suggested in the examples, you can pass expressions or structure items using a space character, type expressions or signature items (anything you'd find at the top level of a module signature) using a : or a pattern using a ?.

Attributes' payload use the same syntax:

let a = 'a' [@attr expr_or_str_item]
let b = 'b' [@attr: type_expr_or_sig_item]
let a = 'a' [@attr? pattern]

Some PPX-es rely on other language extensions, such as the suffix character you can attach to int and float literals (10z could be used by a PPX to turn it into Z.of_string "10" for instance) or quoted strings with a specific identifier ({ppx_name|some quoted string|ppx_name} can be used if you want your PPX to operate on arbitrary strings and not only syntactically correct OCaml), but attributes and extensions are the most commonly used ones.
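For illustration, here's roughly what those two forms look like in source code; the z suffix and the sql identifier are made-up examples of what a PPX could choose to interpret:

let big_number = 10z
(* a PPX could rewrite this suffixed literal into: Z.of_string "10" *)

let query = {sql|SELECT * FROM users|sql}
(* a PPX could recognise the sql identifier and process the quoted string *)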

Attributes and extension points can be expressed using an infix syntax. The attribute version is barely used but some forms of the infix syntax for extension points are used by popular PPX-es and it is likely you will encounter some of the following:

let infix_let_extension =
  let%ext x = 2 in
  ...

let infix_match_extension =
  match%ext y with ...

let infix_try_extension =
  try%ext f z with _ -> ...

which are syntactic sugar for:

let infix_let_extension =
  [%ext let x = 2 in ...]

let infix_match_extension =
  [%ext match y with ...]

let infix_try_extension =
  [%ext try f z with _ -> ...]

A good example of a PPX making heavy use of these is lwt_ppx. The OCaml manual also contains more examples of the infix syntax in the Attributes and Extension points sections mentioned above.

The two main kinds of PPX-es

There is a wide variety of PPX rewriters but the ones you'll probably see the most are Extensions and Derivers.

Extensions

Extensions will rewrite tagged parts of the AST, usually extension nodes of the form [%<extension_name> payload]. They will replace them with a different AST node of the same nature, i.e. if the extension point was located where an expression should be, the rewriter will produce an expression. Good examples of extensions are:

  • ppx_getenv2 which replaces [%getenv SOME_VAR] with the value of the environment variable SOME_VAR at compile time.
  • ppx_yojson which allows you to write Yojson values using OCaml syntax to mimic actual json. For instance you'd use [%yojson {a = None; b = 1}] to represent {"a": null, "b": 1} instead of the Yojson's notation: Assoc [("a", Null); ("b", Int 1)].

Derivers

Derivers or deriving plugins will "insert" new nodes derived from type definitions annotated with a [@@deriving <deriver_name>] attribute. They have various applications but are particularly useful to derive functions that are tedious and error prone to write by hand such as comparison functions, pretty printers or serializers. It's really convenient as you don't have to update those functions every time you update your type definitions. They were inspired by Haskell Type classes. Good examples of derivers are:

  • ppx_deriving itself comes with a bunch of deriving plugins such as eq, ord or show which respectively derive, as you might have guessed, equality, comparison and pretty-printing functions.
  • ppx_deriving_yojson which derives JSON serializers and deserializers.
  • ppx_sexp_conv which derives s-expressions converters.

Derivers often let you attach attributes to specify how some parts of the AST should be handled. For example when using ppx_deriving_yojson you can use [@default some_val] to make a field of an object optional:

type t =
  { a: int
  ; b: string [@default ""]
  }
[@@deriving of_yojson]

will derive a deserializer that converts the JSON value {"a": 1} to the OCaml value {a = 1; b = ""}.

How to write a PPX using ppxlib

Historically there were a few libraries used by PPX rewriter authors to write their PPX-es, including ppx_tools and ppx_deriving, but as the ecosystem evolved, ppxlib emerged and is now the most up-to-date and maintained library to write and handle PPX-es. It wraps the features of those libraries into a single one. I encourage you to use ppxlib to write new PPX-es as it is also easier to make various rewriters work together if they are all registered through ppxlib, and the PPX ecosystem would gain from being unified around a single PPX library and driver.

It is also a great library and has some really powerful features to help you write your extensions and derivers.

Writing an extension

The entry point of ppxlib for extensions is Ppxlib.Extension.declare. You have to use that function to build an Extension.t, from which you can then build a Context_free.Rule.t before registering your transformation so it's actually applied.

The typical my_ppx_extension.ml will look like:

open Ppxlib

let extension =
  Extension.declare
    "my_extension"
    some_context
    some_pattern
    expand_function

let rule = Context_free.Rule.extension extension

let () =
  Driver.register_transformation ~rules:[rule] "my_transformation"

To compile it as a PPX rewriter you'll need to put the following in your dune file:

(library
 (public_name my_ppx)
 (kind ppx_rewriter)
 (libraries ppxlib))

Now let's go back a little and look at the important part:

let extension =
  Extension.declare
    "my_extension"
    some_context
    some_pattern
    expand_function

Here "my_extension" is the name of your extension and that defines how you're going to invoke it in your extension point. In other words, to use this extension in our code we'll use a [%my_extension ...] extension point.

some_context is a Ppxlib.Extension.Context.t and describes where this extension can be found in the AST, i.e. whether you can use [%my_extension ...] as an expression, a pattern, a core type, etc. The Ppxlib.Extension.Context module defines a constant for each possible extension context which you can pass as some_context. This obviously means that it also describes the type of AST node to which it must be converted, and this property is actually enforced by the some_pattern argument. But we'll come back to that later.

Finally expand_function is our actual extension implementation. It takes the payload, a loc argument which contains the location of the expanded extension point, and a path argument which is the fully qualified path to the expanded node (e.g. "file.ml.A.B"), and returns the generated code to replace the extension with.

Ast_pattern

Now let's get back to that some_pattern argument.

This is one of the trickiest parts of ppxlib to understand but it's also one of its most powerful features. The type for Ast_pattern is defined as ('a, 'b, 'c) t where 'a is the type of AST nodes that are matched, 'b is the type of the values you're extracting from the node, expressed as a function type, and 'c is the return type of that last function. This sounded really confusing to me at first and I'm guessing it might to some of you too, so let's give it a bit of context.

Let's look at the type of Extension.declare:

val declare :
  string ->
  'context Context.t ->
  (payload, 'a, 'context) Ast_pattern.t ->
  (loc:Location.t -> path:string -> 'a) ->
  t

Here, the expected pattern's first type parameter is payload, which means we want a pattern that matches payload AST nodes. That makes perfect sense since it is used to describe what your extension's payload should look like and what to do with it. The last type parameter is 'context which again seems logical. As I mentioned earlier our expand_function should return the same kind of node as the one where the extension was found. Now what about 'a? As you can see, it describes what comes after the base loc and path parameters of our expand_function. From the pattern's point of view, 'a describes the parts of the matched AST node we wish to extract for later consumption, here by our expander.

Ast_pattern contains a whole bunch of combinators to let you describe what your pattern should match and a specific __ pattern that you must use to capture the various parts of the matched nodes. __ has type ('a, 'a -> 'b, 'b) Ast_pattern.t which means that whenever it's used it changes the type of consumer function in the returned pattern.

Let's consider a few examples to try wrapping our heads around this. Say I want to write an extension that takes an expression as a payload and I want to pass this expression to my expander so I can generate code based on its value. I can declare the extension like this:

let extension =
  Extension.declare
    "my_extension"
    Extension.Context.expression
    Ast_pattern.(single_expr_payload __)
    expand_function

In this example, Extension.Context.expression has type expression Extension.Context.t, and the pattern has type (payload, expression -> expression, expression) Ast_pattern.t. The pattern says we want to allow a single expression in the payload and capture it. If we decompose it a bit, we can see that single_expr_payload has type (expression, 'a, 'b) Ast_pattern.t -> (payload, 'a, 'b) Ast_pattern.t and is passed __ which makes it a (expression, expression -> 'b, 'b) Ast_pattern.t, and that's exactly what we want here as our expander will have type loc: Location.t -> path: string -> expression -> expression!

It works similarly to Scanf.scanf when you think about it. Changing the pattern changes the type of the consumer function the same way changing the format string does for Scanf functions.
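To make this concrete, here is a minimal sketch of an expand_function matching the declaration above. The behaviour (turning [%my_extension e] into the tuple (e, e)) is purely illustrative, and the output node is built with Ast_builder.Default:

open Ppxlib

(* Illustrative expander: replaces [%my_extension e] with (e, e).
   Its type matches the pattern above:
   loc:Location.t -> path:string -> expression -> expression *)
let expand_function ~loc ~path:_ (expr : expression) =
  let open Ast_builder.Default in
  pexp_tuple ~loc [ expr; expr ]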

This was a bit easy since we had a custom combinator just for that purpose, so let's take a few more complex examples. Now say we want to only allow pairs of integer and string constant expressions in our payload. Instead of just capturing any expression and dealing with the error cases in the expand_function, we can let Ast_pattern deal with that and pass an int and string along to our expander:

Ast_pattern.(single_expr_payload (pexp_tuple ((eint __)^::(estring __)^::nil)))

This one's a bit more elaborate but the idea is the same, we use __ to capture the int and string from the expression and use combinators to specify that the payload should be made of a pair and that gives us a: (payload, int -> string -> 'a, 'a) Ast_pattern.t which should be used with a loc: Location.t -> path: string -> int -> string -> expression expander.

We can also specify that our extension should take something else than an expression as a payload, say a pattern with no when clause so that it's applied as [%my_ext? some_pattern_payload]:

Ast_pattern.(ppat __ none)

or no payload at all and it should just be invoked as [%my_ext]:

Ast_pattern.(pstr nil)

You should play with Ast_pattern a bit if you need to express complex patterns as I think it's the only way to get the hang of it.

Writing a deriver

Registering a deriver is slightly different from registering an extension but in the end it remains relatively simple and you will still have to provide the actual implementation in the form of an expand function.

The typical my_ppx_deriver.ml will look like:

open Ppxlib

let str_type_decl_generator =
  Deriving.Generator.make_no_arg
    ~attributes
    expand_str

let sig_type_decl_generator =
  Deriving.Generator.make_no_arg
    ~attributes
    expand_sig

let my_deriver =
  Deriving.add
    ~str_type_decl:str_type_decl_generator
    ~sig_type_decl:sig_type_decl_generator
    "my_deriver"

Which you'll need to compile with the following library stanza:

(library
 (public_name my_ppx)
 (kind ppx_deriver)
 (libraries ppxlib))

The Deriving.add function is declared as:

val add
  :  ?str_type_decl:(structure, rec_flag * type_declaration list) Generator.t
  -> ?str_type_ext :(structure, type_extension                  ) Generator.t
  -> ?str_exception:(structure, extension_constructor           ) Generator.t
  -> ?sig_type_decl:(signature, rec_flag * type_declaration list) Generator.t
  -> ?sig_type_ext :(signature, type_extension                  ) Generator.t
  -> ?sig_exception:(signature, extension_constructor           ) Generator.t
  -> ?extension:(loc:Location.t -> path:string -> core_type -> expression)
  -> string
  -> t

It takes a mandatory string argument, here "my_deriver", which defines how users are going to invoke your deriver. In this case we'd need to add a [@@deriving my_deriver] to a type declaration in a structure or a signature to use it. Then there's just one optional argument per kind of node to which you can attach a [@@deriving ...] attribute. type_decl corresponds to type t = ..., type_ext to type t += ... and exception to exception My_exc of .... You need to provide generators for the ones you wish your deriver to handle; ppxlib will make sure users get a compile error if they try to use it elsewhere. We can ignore the extension argument as it's just here for compatibility with ppx_deriving.

Now let's take a look at Generator. Its type is defined as ('output_ast, 'input_ast) t where 'input_ast is the type of the node to which the [@@deriving ...] is attached and 'output_ast is the type of the nodes it should produce, i.e. either a structure or a signature. The type of a generator depends on the expand function it's built from. When you use the smart constructor make_no_arg, the expand function should have type loc: Location.t -> path: string -> 'input_ast -> 'output_ast. This function is the actual implementation of your deriver and will generate the list of structure_item or signature_item from the type declaration.
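To give an idea of the shape of such a function, here is a hedged sketch of an expand_str for the str_type_decl case. The generated binding is a meaningless placeholder; a real deriver would inspect the type declarations and build something useful:

open Ppxlib

(* Illustrative generator body: for each type declaration [t], emit
   [let t_placeholder = ()]. *)
let expand_str ~loc ~path:_ (_rec_flag, type_declarations) =
  let open Ast_builder.Default in
  List.map
    (fun (td : type_declaration) ->
      let name = { td.ptype_name with txt = td.ptype_name.txt ^ "_placeholder" } in
      let unit_expr = pexp_construct ~loc { txt = Longident.Lident "()"; loc } None in
      pstr_value ~loc Nonrecursive
        [ value_binding ~loc ~pat:(ppat_var ~loc name) ~expr:unit_expr ])
    type_declarations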

Compatibility with ppx_import

ppx_import is a PPX rewriter that lets you import type definitions and spares you the need to copy and update them every time they change upstream. The main reason why you would want to do that is because you need to derive values from those types using a deriver thus the importance of ensuring your deriving plugin is compatible.

Let's take an example to illustrate how ppx_import is used. I'm using a library called blob which exposes a type Blob.t. For some reason I need to be able to serialize and deserialize Blob.t values to JSON. I'd like to use a deriver to do that as I don't want to maintain that code myself. Imagine Blob.t is defined as:

type t =
  { value : string
  ; length : int
  ; id : int
  }

Without ppx_import I would define somewhere a serializable_blob type as follows:

type serializable_blob = Blob.t =
  { value : string
  ; length : int
  ; id : int
  }
[@@deriving yojson]

That works well, especially because the type definition is simple, but I don't really care about having it here; what I really want is just the to_yojson and of_yojson functions. Also, if the type definition changes, I now have to update it here manually. Maintaining many such imports can be tedious and duplicates a lot of code unnecessarily.

What I can do instead, thanks to ppx_import is to write it like this:

type serializable_blob = [%import: Blob.t]
[@@deriving yojson]

which will ultimately be expanded into the above using Blob's definition of the type t.

Now ppx_import works a bit differently from regular PPX rewriters as it needs a bit more information than just the AST. We don't need to understand how it works but what it means is that if your deriving plugin is used with ppx_import, it will be called twice:

  • A first time with ocamldep. This is required to determine the dependencies of a module in terms of other OCaml modules. PPX-es need to be applied here to find out about dependencies they may introduce.
  • A second time before actually compiling the code.

The issue here is that during the ocamldep pass, ppx_import doesn't have the information it needs to import the type definition yet so it can't copy it and it expands:

type u = [%import: A.t]

into:

type u = A.t

Only during the second pass will it actually expand it to the copied type definition.

This may be a concern if your deriving plugin can't apply to abstract types because you will probably raise an error when encountering one, meaning the first phase will fail and the whole compilation will fail without giving your rewriter a chance to derive anything from the copied type definition.

The right way to deal with this is to have a different behaviour in the context of ocamldep. In this case you can ignore such type declarations or, if you know you are going to inject new dependencies in your generated code, create dummy values referencing them, and just behave normally in any other context.

ppxlib versions 0.6.0 and higher allow you to do so through the Deriving.Generator.V2 API which passes an abstract ctxt value to your expand function instead of a loc and a path. You can tell whether it is the ocamldep pass from within the expand function like this:

open Ppxlib

let expand ~ctxt input_ast =
  let omp_config = Expansion_context.Deriver.omp_config ctxt in
  let is_ocamldep_pass = String.equal "ocamldep" omp_config.Migrate_parsetree.Driver.tool_name in
  ...

Deriver attributes

You'll have noted the attributes parameter in the examples. It's an optional parameter that lets you define which attributes your deriver allows the user to attach to various bits of the type, type extension or exception declaration it is applied to.

ppxlib comes with an Attribute module that lets you properly declare the attributes you want to allow and makes sure they are properly used: correctly spelled, correctly placed, and with the right payload attached. This is especially useful since attributes are ignored by the compiler by default, meaning that without ppxlib's care, plugin users wouldn't get any errors if they misused an attribute, and it might take them a while to figure out they got it wrong and the generated code wasn't impacted as they hoped. The Attribute module offers another great feature: Attribute.t values can be used to extract the attribute payload from an AST node if it is present. That will spare you the need to inspect attributes yourself, which can prove quite tedious.

Ppxlib.Attribute.t is defined as ('context, 'payload) t where 'context describes to which node the attribute can be attached and 'payload, the type of its payload. To build such an attribute you must use Ppxlib.Attribute.declare:

val declare
  :  string
  -> 'a Context.t
  -> (payload, 'b, 'c) Ast_pattern.t
  -> 'b
  -> ('a, 'c) t

Let's try to declare the default argument from ppx_deriving_yojson I mentioned earlier.

The first string argument is the attribute name. ppxlib supports namespaces for attributes so that users can avoid conflicts between attributes of various derivers applied to the same type definitions. For instance, here we could use "default", but it can prove helpful to use a more qualified name such as "ppx_deriving_yojson.of_yojson.default". That means that our attribute can be used as [@@default ...], [@@of_yojson.default ...] or [@@ppx_deriving_yojson.of_yojson.default ...]. Now if another deriver uses a [@@default ...] attribute, users can apply both derivers and provide different default values to each of them by writing:

type t =
  { a : int
  ; b : string [@make.default "abc"] [@of_yojson.default ""]
  }
[@@deriving make,of_yojson]

The context argument works very similarly to the one in Extension.declare. Here we want the attribute to be attached to record field declarations so we'll use Attribute.Context.label_declaration which has type label_declaration Attribute.Context.t.

The pattern argument is an Ast_pattern.t. Now that we know how to work with those, this is pretty easy. Here we need to accept any expression as a payload, since we should be able to apply the default attribute to any field regardless of its type, and we want to extract that expression from the payload so we can use it in our deserializer, so let's use Ast_pattern.(single_expr_payload __).

Finally the last 'b argument has the same type as the pattern consumer function. We can use it to transform what we extracted using the previous Ast_pattern but in this case we just want to keep the expression as we got it so we'll just use the identity function here.

We end up with the following:

let default_attribute =
  Attribute.declare
    "ppx_deriving_yojson.of_yojson.default"
    Attribute.Context.label_declaration
    Ast_pattern.(single_expr_payload __)
    (fun expr -> expr)

and that gives us a (label_declaration, expression) Attribute.t.

You can then use it to collect the attribute payload from a label_declaration:

Attribute.get default_attribute label_decl

which will return Some expr if the attribute was attached to label_decl or None otherwise.
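For instance, inside a deriver that handles record types, one could collect the optional default for every field like this (a small sketch; the helper name is made up):

open Ppxlib

(* Pair each record field's name with the payload of its default attribute,
   if the user attached one. *)
let field_defaults (fields : label_declaration list) =
  List.map
    (fun (ld : label_declaration) ->
      (ld.pld_name.txt, Attribute.get default_attribute ld))
    fields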

Because of their polymorphic nature, attributes need to be packed, i.e. wrapped in a variant to hide the type parameter, so if you want to pass them to Generator.make_no_arg you'll have to do it like this:

let attributes = [Attribute.T default_attribute]
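Tying this back to the generator from earlier, that packed list is what gets passed through the attributes label of make_no_arg (a sketch reusing the names from the previous examples):

let str_type_decl_generator =
  Deriving.Generator.make_no_arg
    ~attributes:[ Attribute.T default_attribute ]
    expand_str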

Writing your expand functions

In the last two sections I mentioned expand functions that would contain the actual deriver or extension implementation but didn't actually say anything about how to write those. It will depend a lot on the purpose of your PPX rewriter and what you're trying to achieve.

Before writing your PPX you should clearly specify what it should be applied to and what code it should produce. That will help you declare the right deriving or extension rewriter, and from there you'll know the type of the expand functions you have to write, which should help.

A good way to proceed is to use the dumpast tool to pretty print the AST fragments of both the input of your expander and the output, ie the code it should generate. To take a concrete example, say you want to write a deriving plugin that generates an equal function from a type definition. You can start by running dumpast on the following file:

type some_record =
  { a : int64
  ; b : string
  }

let equal_some_record r r' = Int64.equal r.a r'.a && String.equal r.b r'.b

That will give you the AST representation of a record type definition and the equal function you want to write so you can figure out how to deconstruct your expander's input to be able to generate the right output.

ppxlib exposes smart constructors in Ppxlib.Ast_builder.Default to help you build AST fragments without having to care too much about attributes and such fields, as well as some convenience constructors to keep your code concise and readable.

Another convenience tool ppxlib exposes to help you build AST fragments is metaquot. I recently wrote a bit of documentation about it here, which you should take a look at, but to sum it up: metaquot is a PPX extension allowing you to write AST nodes using the OCaml syntax they describe instead of the AST types.
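As a hedged example, assuming the ppxlib.metaquot preprocessor is enabled for your library and a loc value is in scope, building an expression node can be as simple as:

open Ppxlib

(* [%expr ...] is expanded by metaquot into the Parsetree value for the
   quoted expression; the labelled ~loc argument provides the [loc] it needs. *)
let make_incr ~loc : expression =
  [%expr fun x -> x + 1]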

Handling code locations in a PPX rewriter

When building AST fragments you should keep in mind that you have to set their location. Locations are the part of AST values that describes the position of the corresponding node in your source file, including the file name and the line number and offset of both the beginning and the end of the code bit they represent.

Because your code was generated after the file was parsed, it doesn't have a location, so you need to set it yourself. One could think that it doesn't matter and that we could use a dummy location, but locations are used by the compiler to properly report errors. That's why a PPX rewriter should care about how it locates the generated code: it helps the end user understand whether an error comes from their code or from generated code, and how to eventually fix it.

Both Ast_builder and metaquot expect a location. The first explicitly takes it as a labelled loc argument while the second relies on a loc value being available in the scope. It is important to set those with care, as errors in the generated code don't necessarily mean that your rewriter is buggy. There are valid cases where your rewriter functioned as intended but the generated code triggers an error. PPX-es often work on the assumption that some values are available in the scope; if the user doesn't properly provide those, it's their responsibility to fix the error. To help them do so, it is important to properly locate the generated code to guide them as much as possible.

When writing extensions, using the whole extension point location for the generated code makes perfect sense as that's where the code will sit. That's fairly easy as this is what ppxlib passes to the expand function through the loc labelled argument. For deriving plugins it's a bit different, as the generated code doesn't replace an existing part of the parsed AST but is a new one to insert. Currently ppxlib gives you the loc of the whole type declaration, extension or exception declaration your deriving plugin is applied to. Ideally it would be nice to be able to locate the generated code on the plugin name in the deriving attribute payload, i.e. here:

[@@deriving my_plugin,another_plugin]
            ^^^^^^^^^

I'm currently working on making that location available to the expand function. In the meantime, you should choose a convention. I personally locate all the generated code on the type declaration. Some choose to locate the generated code on the part of the input AST they're handling when generating it.

Reporting errors to your rewriter users

You won't always be able to handle all the AST nodes passed to your expand functions, either because the end user misused your rewriter or because there are some cases you simply can't deal with.

In those cases you can report the error to the user with Ppxlib.Location.raise_errorf. It works similarly to printf and you can build your error message from a format string and extra arguments. It will then raise an exception which will be caught and reported by the compiler. A good practice is to prefix the error message with the name of your rewriter to help users understand what's going on, especially with deriving plugins as users might apply several of them to the same type declaration.

Another point to take care of here is, again, locations. raise_errorf takes a labelled loc argument. It is used so that your error is reported like any compiler error. Having good locations in those error messages is just as important as sending clear error messages. Keep in mind that both the errors you report yourself and errors coming from your generated code will be highlighted by Merlin, so when properly set they make it much easier to work with your PPX rewriter.
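Putting those two points together, a typical error helper might look like this (a sketch; the rewriter name and message are illustrative):

open Ppxlib

(* Report an unsupported construct, prefixed with the rewriter's name and
   located on the offending node so the compiler and Merlin point at it. *)
let unsupported ~loc kind =
  Location.raise_errorf ~loc
    "my_ppx_rewriter: cannot derive anything for %s" kind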

Testing your PPX

Just as most pieces of code do, a PPX deserves to be tested, and it has become easier over the years to test rewriters.

I personally tend to write as many unit tests as possible for my PPX-es' internal libraries. I try to extract helper functions that can easily be unit-tested, but I can't test everything that way. Testing the ast -> ast functions would be tedious as ppxlib and ocaml-migrate-parsetree don't provide the comparison and pretty printing functions that you could use with alcotest or oUnit. That means you'd have to import the AST types and derive them on your own. That would mean a lot of boilerplate, and even if those functions were exposed, writing such tests would be really tedious. There's a lot of things to take into account. How are you going to build the input AST values, for instance? If you use metaquot, every node will share the same loc, making it hard to test that your errors are properly located. If you don't, you will end up with insanely long and unreadable test code or fixtures. While that would allow extremely accurate testing for the generated code and errors, it will almost certainly make your test code unmaintainable, at least given the current tooling.

Don't panic, there is a very good and simple alternative. ppxlib makes it very easy to build a binary that will parse OCaml code, preprocess the AST with your rewriter and spit it out, formatted as code again.

You just have to write the following pp.ml:

let () = Ppxlib.Driver.standalone ()

and build the binary with the following dune stanza, assuming your rewriter is called my_ppx_rewriter:

(executable
 (name pp)
 (modules pp)
 (libraries my_ppx_rewriter ppxlib))

Because we're humans and the OCaml syntax is meant for us to write and read, it makes for much better test input/output. You can now write your test input in a regular .ml file, use the pp.exe binary to "apply" your preprocessor to it and compare the output with another .ml file containing the code you expect it to generate. This kind of test pattern is really well supported by dune thanks to the diff user action.

I usually have the following files in a rewriter/deriver folder within my test directory:

test/rewriter/
├── dune
├── test.expected.ml
├── pp.ml
└── test.ml

Where pp.ml is used to produce the rewriter binary, test.ml contains the input OCaml code and test.expected.ml the result of preprocessing test.ml. The dune file content is generally similar to this:

(executable
 (name pp)
 (modules pp)
 (libraries my_ppx_rewriter ppxlib))

(rule
 (targets test.actual.ml)
 (deps (:pp pp.exe) (:input test.ml))
 (action (run ./%{pp} -deriving-keep-w32 both --impl %{input} -o %{targets})))

(alias
 (name runtest)
 (action (diff test.expected.ml test.actual.ml)))

(test
  (name test)
  (modules test)
  (preprocess (pps my_ppx_rewriter)))

The first stanza is the one I already introduced above and specifies how to build the rewriter binary.

The rule stanza that comes after that indicates to dune how to produce the actual test output by applying the rewriter binary to test.ml. You probably noticed the -deriving-keep-w32 both CLI option passed to pp.exe. By default, ppxlib will generate values or add attributes so that your generated code doesn't trigger an "Unused value" warning. This is useful in real-life situations, but here it would just pollute the test output and make it harder to read, so we disable that feature.

The following alias stanza is where all the magic happens. Running dune runtest will now generate test.actual.ml and compare it to test.expected.ml. It will not only do that but show you how they differ from each other in a diff format. You can then automatically update test.expected.ml if you're happy with the results by running dune promote.

Finally the last test stanza is there to ensure that the generated code compiles without type errors.

This makes a very convenient test setup to write your PPX-es TDD style. You can start by writing an identity PPX that just returns its input AST as it is. Then you add some OCaml code using your soon-to-be PPX in test.ml and run dune runtest --auto-promote to prefill test.expected.ml. From there you can start implementing your rewriter and run dune runtest to check on your progress, updating the expected result with dune promote. Going pure TDD by writing the expected output first works, but it's tricky because you'd have to format your code the same way pp.exe formats the AST. It would be great to be able to specify how to format the generated test.actual.ml so that this approach would be more viable and the diff more readable. Being able to use ocamlformat with a very diff-friendly configuration would be great there. pp.exe seems to offer CLI options to change the code style, such as -styler, but I haven't had the chance to experiment with those yet.

Now you can test successful rewriting this way, but what about errors? There's a lot of value in ensuring you produce the right errors at the right code locations, because that's the kind of thing you can get wrong when refactoring your rewriter code or when people try to contribute. That isn't as likely to happen if your CI yells when you break the error reporting. So how do we do that?

Well, pretty much the exact same way! We write a file with an erroneous invocation of our rewriter, run pp.exe on it and compare stderr with what we expect it to be. There are two major differences here. First, we want to collect the stderr output of the rewriter binary instead of using it to generate a file. Second, we can't write all of our test cases in a single file since pp.exe will stop at the first error. That means we need one .ml file per error test case. Luckily for us, dune offers ways to do both.

For every error test file we will want to add the following stanzas:

(rule
  (targets test_error.actual)
  (deps (:pp pp.exe) (:input test_error.ml)) 
  (action
    (with-stderr-to
      %{targets}
      (bash "./%{pp} -no-color --impl %{input} || true")
    )
  )
)

(alias
  (name runtest)
  (action (diff test_error.expected test_error.actual))
)

but obviously we don't want to do that by hand every time we add a new test case, so we need a script to generate those stanzas and then include them in our dune file using (include dune.inc).

To achieve that while keeping things as clean as possible I use the following directory structure:

test/rewriter/
├── errors
│   ├── dune
│   ├── dune.inc
│   ├── gen_dune_rules.ml
│   ├── pp.ml
│   ├── test_some_error.expected
│   ├── test_some_error.ml
│   ├── test_some_other_error.expected
│   └── test_some_other_error.ml
├── dune
├── test.expected.ml
├── pp.ml
└── test.ml

Compared to our previous setup, we only added the new errors folder. To keep things simple it has its own copy of pp.ml, but in the future I'd like to improve this a bit and use the same pp.exe binary.

The most important files here are gen_dune_rules.ml and dune.inc. The first is just a simple OCaml script to generate the above stanzas for each test case in the errors directory. The second is the file we'll include in the main dune. It's also the file to which we'll write the generated stanzas.

I personally use the following gen_dune_rules.ml:

let output_stanzas filename =
  let base = Filename.remove_extension filename in
  Printf.printf
    {|
(library
  (name %s)
  (modules %s)
  (preprocess (pps ppx_yojson))
)

(rule
  (targets %s.actual)
  (deps (:pp pp.exe) (:input %s.ml))
  (action
    (with-stderr-to
      %%{targets}
      (bash "./%%{pp} -no-color --impl %%{input} || true")
    )
  )
)

(alias
  (name runtest)
  (action (diff %s.expected %s.actual))
)
|}
    base
    base
    base
    base
    base
    base

let is_error_test = function
  | "pp.ml" -> false
  | "gen_dune_rules.ml" -> false
  | filename -> Filename.check_suffix filename ".ml"

let () =
  Sys.readdir "."
  |> Array.to_list
  |> List.sort String.compare
  |> List.filter is_error_test
  |> List.iter output_stanzas

Nothing spectacular here: we just build the list of all the .ml files in the directory, except pp.ml and gen_dune_rules.ml itself, and then generate the right stanzas for each of them. You'll note the extra library stanza, which I add to get dune to generate the right .merlin so that I can see the error highlights when I edit the files by hand.

With that we're almost there: add the following to the dune file and you're all set:

(executable
  (name pp)
  (modules pp)
  (libraries
    ppx_yojson
    ppxlib
  )
)

(include dune.inc)

(executable
  (name gen_dune_rules)
  (modules gen_dune_rules)
)

(rule
  (targets dune.inc.gen)
  (deps
    (:gen gen_dune_rules.exe)
    (source_tree .)
  )
  (action
    (with-stdout-to
      %{targets}
      (run %{gen})
    ) 
  )
)

(alias
  (name runtest)
  (action (diff dune.inc dune.inc.gen))
)

The first stanza is here to specify how to build the rewriter binary, same as before, while the second stanza just tells dune to include the content of dune.inc within this dune file.

The interesting part comes next. As you can guess, the executable stanza builds our little OCaml script into a .exe. The rule that comes after it specifies how to generate the new stanzas by running gen_dune_rules and capturing its standard output into a dune.inc.gen file. The last stanza allows you to review the changes to the generated stanzas and use promotion to accept them. Once this is done, the new stanzas will be included in the dune file and the tests will be run for every test case.

Adding a new test case is then pretty easy; you can simply run:

$ touch test/rewriter/errors/some_explicit_test_case_name.{ml,expected} && dune runtest --auto-promote

That will create the new empty test case and update dune.inc with the corresponding rules. From there you can proceed the same way as with the successful rewriting tests: update the .ml, run dune runtest to take a sneak peek at the output, and dune promote once you're satisfied with the result.

I've been pretty happy with this setup so far, although there's room for improvement. It would be nice to avoid duplicating pp.ml for error testing. This also involves quite a bit of boilerplate that I have to copy into all my PPX rewriter repositories every time. Hopefully dune plugins will help with that; I can't wait for a first version to be released so that I can write a plugin to make this test pattern more accessible and easier to set up.

Quiet Earth: WATCHMEN Trailer: HBO Tackles A Graphic Novel Classic

Tick tock. Watchmen debuts this fall on HBO.

From Damon Lindelof, Watchmen is a modern-day re-imagining of Alan Moore's groundbreaking graphic novel about masked vigilantes. Zack Snyder was the last to bring it to vivid life on the big screen.

The cast includes: Regina King, Jeremy Irons, Don Johnson, Jean Smart, Tim Blake Nelson, Louis Gossett Jr., Yahya Abdul-Mateen II, Hong Chau, Andrew Howard, Tom Mison, Frances Fisher, Jacob Ming-Trent, Sara Vickers, Dylan Schombing, and James Wolk.


Check out the first teaser trailer below:




Recommended Release: Watchmen [Continued ...]

Quiet Earth: Cinepocalypse 2019 Program: VEROTIKA Directorial Debut of Glenn Danzig to Open

Cinepocalypse, Chicago’s premier festival for electrifying and provocative genre cinema, returns to the Music Box Theatre June 13th for eight days of features, shorts, events and surprises, including eight fantastic break-out world premieres.

ALTER, the horror brand from Gunpowder & Sky that curates, develops and distributes award-winning short films, series, and feature films, is the official sponsor of the fest this year.

Cinepocalypse is also supported by Orion, Bloody-Disgusting.com, Shudder, Fangoria & Vinegar Syndrome.

The 2019 poster was designed by visionary heavy metal artist Paul Romano, whose work can be found on hundreds of album covers and who frequently collaborates with famed force of musical destruction Mastodon.


Festival badges are available now, and in [Continued ...]

Open Culture: Discover Frida Kahlo’s Wildly-Illustrated Diary: It Chronicled the Last 10 Years of Her Life, and Then Got Locked Away for Decades

When we admire a famous artist from the past, we may wish to know everything about their lives—their private loves and hates, and the inner worlds to which they gave expression in canvases and sculptures. A biography may not be strictly necessary for the appreciation of an artist’s work. Maybe in some cases, knowing too much about an artist can make us see the autobiographical in everything they do. Frida Kahlo, on the other hand, fully invited such interpretation, and made knowing the facts of her life a necessity.

She can hardly "be accused of having invented her problems," writes Deborah Solomon at The New York Times, yet she invented a new visual vocabulary for them, achieving her mostly posthumous fame “by making her unhappy face the main subject of her work.”



Her “specialty was suffering”—her own—“and she adopted it as an artistic theme as confidently as Mondrian claimed the rectangle or Rubens the corpulent nude.” Kahlo treated her life as a subject as worthy as the respectable middle-class still lifes and aristocratic portraits of the old masters. She transfigured herself into a personal language of symbols and surreal motifs.

This means we must peer as closely into Kahlo’s life as we are able if we want to fully enter into what Museum of Modern Art curator Kirk Varnedoe called “her construction of a theater of the self.” But we may not feel much closer to her after reading her wildly-illustrated diary, which she kept for the last ten years of her life, and which was locked away after her death in 1954 and only published forty years later, with an introduction by Mexican novelist Carlos Fuentes. The diary was then republished by Abrams in a beautiful hardcover edition that retains Fuentes’ introduction.

If you’re looking for a historical chronology or straightforward narrative, prepare for disappointment. It is, writes Kathryn Hughes at The Telegraph, a diary “of a very particular kind. There are few dates in it, and it has nothing to say about events in the external world—Communist Party meetings, appointments at the doctor’s or even trysts with Diego Rivera, the artist whom Kahlo loved so much that she married him twice. Instead it is full of paintings and drawings that appear to be dredged from her fertile unconscious.”

This description suggests that the diary substitutes the image for the word, but this is not so—it is filled with Kahlo’s experiments with language: playful prose-poems, witty and cryptic captions, free-associative happy accidents. Like the visual autobiography of kindred spirit Jean-Michel Basquiat, her private feelings must be inferred from documents in which image and word are inseparable. There are “neither startling disclosures,” writes Solomon, “nor the sort of mundane, kitchen-sink detail that captivates by virtue of its ordinariness.” Rather than exposition, the diary is filled, as Abrams describes it, with "thoughts, poems, and dreams... along with 70 mesmerizing watercolor illustrations."

Kahlo’s diary allows for no “dreamy identification with its subject,” notes Solomon, through Instagram-worthy summaries of her dinners or wardrobe woes. Unlike her many gushing letters to Rivera and other lovers, the “irony is that these personal sketches are surprisingly impersonal.” Or rather, they express the personal in her preferred private language, one we must learn to read if we want to understand her work. More than any other artist of the time, she turned biography into mythology.

Knowing the bare facts of her life gives us much-needed context for her images, but ultimately we must deal with them on their own terms as well. Rather than explaining her painting to us, Kahlo’s diary opens up an entirely new world of imagery—one very different from the controlled self-portraiture of her public body of work—to puzzle over.

Related Content:

Frida Kahlo’s Passionate Love Letters to Diego Rivera

A Brief Animated Introduction to the Life and Work of Frida Kahlo

Visit the Largest Collection of Frida Kahlo’s Work Ever Assembled: 800 Artifacts from 33 Museums, All Free Online

Josh Jones is a writer and musician based in Durham, NC. Follow him at @jdmagness

Colossal: A Literal Translation Lends a Daring Edge to the First Meal of the Day

Although breakfast is commonly consumed in a rush out the door, or slurped hurriedly before one dashes to catch the bus, the early morning meal’s straightforward composition of actions is often not considered. Madrid-based photographer Tessa Dóniga created the series Break/Fast after becoming intrigued by the deconstructed word’s literal translation to Spanish. Smashed cereal, a sliced bean can, and a quickly melting stick of butter all serve as subjects of the surreal photographic series, which highlights the different ways breakfast can be “broken.”

The project was first realized in collaboration with independent journal Polpettas and serves as a metaphor for how Dóniga views the world as a bilingual speaker. “The fact that I’m bilingual makes me wonder more,” she told gestalten. “When I try to translate some words into one language from another, I question myself. My challenge was to set in one image both terms in a visual composition that would be recognizable to the viewer.”

Like all of Dóniga’s uniquely styled series, Break/Fast was created from scratch, with editing in postproduction for some of her more high-flying effects such as hovering bacon or scattered eggshells. You can see more of her food styling photography on her studio’s website and on Instagram.

OCaml Planet: A course on homotopy (type) theory

This semester my colleague Jaka Smrekar and I are teaching a graduate course on homotopy theory and homotopy type theory. The first part was taught by Jaka and was a nice review of classical homotopy theory leading up to Quillen model categories. In the second part I am covering basic homotopy type theory. The course … Continue reading A course on homotopy (type) theory

MattCha's Blog: 2003 Yuanjiutang Ban Zhang Thoughts on the Plum Essence Taste


I got this complimentary sample from Tiago of Tea Encounter (the cake sells for $124.95 for a 350g cake or $0.36/g) in a care package a few months ago.  An interesting thing about this puerh is that it has been stored in the drier, more northern Hebei province of China…

Dry leaves are small, likely plantation, and quite tippy.  They have the odour of strawberry yogurt with slight wood.

First infusion has a slightly sour almost fruit like onset, barely dusty old puerh taste before it arrives at a mild almost minty peak.  It then becomes slightly creamy and sweet.  A strawberry fruit note comes to mind here.  The sweetness is long and creamy.  There are pops of various fruit tastes in the aftertaste as well as a bubble gum flavouring.  The overall feeling and tea liquor suggest pretty dry storage.

The second infusion starts off just mildly astringent like the skin of a grape with a slight dusty aged note before it becomes barely minty and cool with a nice bubble gum taste in the aftertaste.  The mouthfeel is slightly tight and astringent.  The storage is nicely dry.  The long candy finish is uninterrupted by other nuances and is long on the breath.  The qi is pooling in the head and the body feels relaxed, the joints nimble.  The qi is sedating and tranquil as things slow down around me.

The third infusion starts a touch astringent, dusty, almost sour grape peel before it goes into a short mild mint and long distinct candy breath.  The bubble gum sweetness is long and obvious on the breath.  The mouthfeel is slightly tight and tart.  The throat feels open at the mid-level.  This puerh is not overly complex but rather pure and obvious in its presentation of grape, dust, mint, and bubble gum progression. Almost a red wine like taste and odour in there as well.

The fourth infusion starts off almost legume like, almost coffee or cocoa mild bitter and oak barrel, then transitions to mint and a long clean bubble gum finish.  The Qi is real nice; I feel a bit euphoric now with my bubble gum breath.  There is an oak aged wine like taste to this, almost alcohol.  The dry storage finish is nice.

The fifth has an almost merlot onset that is simple and pure almost oak barrel then barely/ not really minty then long bubble gum.  The merlot taste is throughout.  The profile is real simple, crisp, nice very dry and compressed storage.  There is some faint floral and fruit that is hard to pin below the surface.   The mouthfeel is slightly astringent and sticky on the lips.

The sixth infusion has a nice oak barrel wine merlot taste, more woody now with a very faint lingering creamy sweetness.  The mouthfeel is a touch tart. Sweet bubble gum finish.

The seventh infusion has a velvety mouthfeeling now, a touch dry.  There is a subtle smoke oak taste in these initial tastes that melt well with mild grape tastes, then mint then a creamy sweetness appears more the bubble gum.  The sweet taste is uninterrupted and long.

The eighth and ninth infusions have an oak barrel merlot taste to it then eases into a long bubble gum sweetness.  Qi feeling is gently sedating here.  The sweetness on the breath is long.  The flavours are simple and enjoyable and clear.

The tenth has a slight pungent medical wood touch under oaked wine, mint, and creamy sweetness. The eleventh is back to the same profile.  The 12th shows signs of slight medicinal again along with the other tastes. The simple consistent profile is evidence of a single estate.  The mouthfeel is slightly astringent, almost velvety.  The tea liquor is slightly thinner but a consistent broth.

The thirteenth is similar in profile; the grape skin of early infusions is gone, there is a touch of medicinal taste in there, and the finish becomes merlot like.  The profile is simple and enjoyable.

In the fourteenth the initial taste is dry wood and slight cherry fruit there is some very mild cooling mint then a long merlot wine sweetness.  The fifteenth is much the same with more of a creamy sweet finish than wine.  The slight bit of astringency in this puerh gives it some substance despite the thinner, dry stored broth.  It feels very pure in the body, the Qi seems to wear off by mid-session.

The next few infusions are dry wood with slight wine profile and long slight sweet finish.  I decide to steep it a few more times.

I enjoy this simple, clean, pure, example of dry storage.  The clean merlot taste and long bubble gum aftertaste is quite enjoyable.

The long dry storage bubble gum sweetness is also described in China as the plum taste but I have tasted plum before and it really is more like bubble gum or cotton candy to me so I usually describe it this way.  A nice long dry machine compressed storage is sure to bring about this note in a reliable puerh.  The last cake I purchased with this note and similar long dry storage was a cake of the sold out dry Shanghai stored 2003 CNNP Small Green Mark Iron Cake from Yunnan Sourcing  which also has this characteristic.  This 2003 Ban Zhang is a bit different in its profile but for those who enjoyed that cake and storage I would recommend this one.

So the question remains: Is this really from the famous Ban Zhang area?  Well, it’s always possible but also probably more unlikely than possible.  Part of these older “Ban Zhang” are just the fun of imagining that they may in fact be from somewhere in or near Ban Zhang… most likely not Lao Ban Zhang but it’s always possible that it is some plantation fixture from nearby, I suppose.   Hahahah…

Certainly this one is much more possible than some Six Famous Mountains Factory Banzhang cakes I have or my cakes from Laomane Menghai Banzhang Factory.  This 2003 is far, far more enjoyable; a comparison to these factory cakes doesn't do this Yuanjiutang any justice!

I actually blind sampled this one that was labeled “2003 Banzhang” and appraised it at $200-300 for 357g cake.  So in my eyes this cake is at least a 30% value.

Peace

Planet Haskell: FP Complete: Maximizing Haskell Webinar Review

We are happy to announce that we have been sponsoring free webinars for over a year now. The feedback we have been receiving from the IT community has been overwhelmingly positive. We have been working towards producing a new webinar topic every month, and we plan to keep moving at that pace. In this webinar, Michael Snoyman, Vice President of Engineering at FP Complete, discusses how to maximize the "Haskell Success Program". We had 189 people registered for the event, which aired on Wednesday, May 1st at 10:00 am PDT.


About the Webinar


In this month's webinar, Michael Snoyman demonstrated just how easy it is to get started with maximizing FP Complete's "Haskell Success Program".

Topics covered:

During the webinar we tried to answer these questions:

  • Why you should use Haskell
  • The benefits of Haskell in a commercial setting
  • How you can get your team to adopt more Haskell
  • About the training resources we provide online
  • How the FP Complete team can help your company succeed with Haskell



Watch the Webinar

 

We decided to include the chat log for this webinar, and it can be seen at the end of this blog post.

We have a winner!

Haskell Success Program Winning Draw is...

Congratulations Lauri Lättilä of Helsinki, Finland's SimAnalytics! Lauri & SimAnalytics have won FP Complete's $1 Haskell Success Program Drawing. FP Complete looks forward to working together with Lauri and his SimAnalytics' Haskell team.

NEW! 2 Success Programs!

Following the great feedback and success of its new Haskell Success Program, FP Complete would like to take this opportunity to introduce and welcome its 2 new Success Programs!

BlockChain Success Program

Bringing world-class engineering to your rapid blockchain projects.

Engineer to engineer blockchain mentoring that saves time, slashes risks, and pays for itself!


DevOps Success Program

Accelerate your team's expertise in cloud tools & automation 

Mentoring, teaching, and collaboration customized to your team's needs

Engineer To Engineer

Fixing broken processes with DevOps. Engineer to engineer mentoring that pays for itself! 


Do You Know FP Complete?

At FP Complete, we do so many things to help companies it's hard to encapsulate our impact in a few words.  They say a picture is worth a thousand words, so a video has to be worth 10,000 words (at least). Therefore, to tell all we can in as little time as possible, check out our explainer video. It's only 108 seconds to get the full story of FP Complete. 


We want your feedback for webinar topics

We would like to hear your suggestions for future webinar topics. The simplest way to accomplish this is to add a comment to this blog post with your suggestion. Alternatively, send your suggestion via email to socialmedia@fpcomplete.com.

Webinar Chat Log

We find it useful to share what was chatted about during the webinar. You can see the chat flow below.

00:06:56 Yanik Koval: hey
00:07:08 Chris Done: hey!
00:07:37 Yanik Koval: Will you record this?
00:07:51 Chris Done: yes, it's being recorded
00:08:02 Chris Done: it'll be uploaded to our YouTube channel
00:08:15 Yanik Koval: perfect, thank you
01:09:31 Do you mean 20% of Haskell gives 80% of the benefits? Is that just a type?
01:16:30 Bruce Alspaugh: I see languages like Java have lots of free or inexpensive MOOCs available, but it is hard to find very many on platforms like EdX, Coursera, Linda, etc. but there are very few for Haskell. Is there any effort underway to expand these offerings?
01:19:22 : I have seen some university/academic lecturers giving talks about teaching Haskell via MOOCs, but I'm not aware of commercial initiatives in this area.
https://glasgowmoocadventures.wordpress.com/

01:24:45 Bruce Alspaugh: Do you have any recommendations for how to work with existing local user groups, or establishing new Haskell-oriented user groups to encourage developers to learn Haskell, and companies to start pilot projects?
01:35:13 Bruce Alspaugh: I see languages like Java have lots of free or inexpensive MOOCs available on platforms like EdX, Coursera, Lynda, etc., but there are very few available for Haskell. Is there any effort underway to expand these offerings?
01:39:51 Bruce Alspaugh: Is Haskell on the JVM using Eta or Frege a viable option?
01:41:04 Byron Hale: To All Panelists : My experience with Python is that it's like an elephant in a china-shop when it comes to package management. A carefully curated system will be overwritten without a by-your-leave. Python is the preferred language for machine-learning. What can be done?
01:46:03 Bruce Alspaugh: Does Haskell have good libraries for PDF report generation?
01:47:07 Byron Hale: For example, can Python be added to Nix? Guix?
01:59:21 Dan Banta: Thank you.
01:59:36 Chris Done: Thanks all!!
01:59:53 Bulent Basaran: Thanks You!

Colossal: Minimalist Modular Systems Turn Walls Into Feline Playgrounds

Mike Wilson and Megan Hanneman, founders of CatastrophiCreations, design modular wall-mounted systems to keep cats active. Parents of humans and pets alike (myself included) are all too familiar with the trip hazard of toys scattered on the floor. Wilson and Hanneman move the activity zone to the wall with vertical playgrounds that allow cats to climb, jump, scratch, and even tip-toe across swinging bridges. Eschewing bright colors and plastic materials, the designers use solid wood, hidden brackets, and canvas to create more subtle and sustainable products. You can learn more about the Grand Rapids, Michigan-based business in an interview and factory tour on Etsy’s blog. Check out their range of products, from the Thunderdome to the Temple Complex, in their online store.

Penny Arcade: Comic: Too Fast, Too Furious

New Comic: Too Fast, Too Furious

BOOOOOOOM! – CREATE * INSPIRE * COMMUNITY * ART * DESIGN * MUSIC * FILM * PHOTO * PROJECTS: Illustrator Spotlight: Eric Hosford

Eric Hosford’s Website

Eric Hosford on Instagram

BOOOOOOOM! – CREATE * INSPIRE * COMMUNITY * ART * DESIGN * MUSIC * FILM * PHOTO * PROJECTS: Illustrator Spotlight: Lulu Lin

Lulu Lin on Instagram

Explosm.net: Comic for 2019.05.08

New Cyanide and Happiness Comic

Planet Haskell: Donnacha Oisín Kidney: Some Tricks for List Manipulation

Posted on May 8, 2019
Tags: Haskell

This post is a collection of some of the tricks I’ve learned for manipulating lists in Haskell. Each one starts with a puzzle: you should try the puzzle yourself before seeing the solution!

The Tortoise and the Hare

How can you split a list in half, in one pass, without taking its length?

This first one is a relatively well-known trick, but it occasionally comes in handy, so I thought I’d mention it. The naive way is as follows:

But it’s unsatisfying: we have to traverse the list twice, and we’re taking its length (which is almost always a bad idea). Instead, we use the following function:

The “tortoise” and the “hare” are the two arguments to go: it traverses the second one twice as fast, so when it hits the end, we know that the first list must be halfway done.
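
The post's Haskell code is not reproduced here, but the same two-pointer idea can be sketched in OCaml (the function name is mine):

(* split_half xs returns (front, back) in a single pass: the "hare" walks the
   list two steps at a time, so when it runs out, the "tortoise" has kept
   exactly the first half. *)
let split_half xs =
  let rec go slow fast =
    match slow, fast with
    | _, ([] | [ _ ]) -> ([], slow)
    | x :: slow', _ :: _ :: fast' ->
        let front, back = go slow' fast' in
        (x :: front, back)
    | [], _ -> ([], [])  (* unreachable: fast is never longer than slow *)
  in
  go xs xs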

There and Back Again

Given two lists, xs and ys, write a function which zips xs with the reverse of ys (in one pass).

There’s a lovely paper (Danvy and Goldberg 2005) which goes though a number of tricks for how to do certain list manipulations “in reverse�. Their technique is known as “there and back again�. However, I’d like to describe a different way to get to the same technique, using folds.

Whenever I need to do some list manipulation in reverse (i.e., I need the input list to be reversed), I first see if I can rewrite the function as a fold, and then just switch out foldr for foldl.

For our puzzle here, we need to first write zip as a fold:

If that looks complex, or difficult to write, don’t worry! There’s a systematic way to get to the above definition from the normal version of zip. First, let’s start with a normal zip:

Then, we need to turn it into a case-tree, where the first branch is on the list we want to fold over. In other words, we want the function to look like this:

To figure out the cases, we factor out the cases in the original function. Since the second clause (zip xs [] = []) is only reachable when xs /= [], it’s effectively a case for the x:xs branch.

Now, we rewrite the different cases to be auxiliary functions:

And finally, we refactor the recursive call to the first case expression.

Then those two auxiliary functions are what you pass to foldr!

So, to reverse it, we simply take wherever we wrote foldr f b, and replace it with foldl (flip f) b:

Of course, we’re reversing the wrong list here. Fixing that is simple:

Maintaining Laziness

Rewrite the above function without using continuations.

zipRev, as written above, actually uses continuation-passing style. In most languages (including standard ML, which was the one used in Danvy and Goldberg (2005)), this is pretty much equivalent to a direct-style implementation (modulo some performance weirdness). In a lazy language like Haskell, though, continuation-passing style often makes things unnecessarily strict.

Consider the Church-encoded pairs:

So it’s sometimes worth trying to avoid continuations if there is a fast direct-style solution. (alternatively, continuations can give you extra strictness when you do want it)

First, I’m going to write a different version of zipRev, which folds on the first list, not the second.

Then, we inline the definition of foldl:

Then, as a hint, we tuple up the two accumulating parameters:

What we can see here is that we have two continuations stacked on top of each other. When this happens, they can often “cancel out”, like so:

And we have our direct-style implementation!
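
The Haskell itself is omitted above, but the shape of such a direct-style solution can be sketched in OCaml (names are mine; it assumes both lists have the same length):

(* zip_rev xs ys pairs xs with the reverse of ys in a single pass: the
   recursion walks down xs "there", and hands out elements of ys on the way
   "back again", so the first element of xs meets the last element of ys. *)
let zip_rev xs ys =
  let rec walk = function
    | [] -> ([], ys)  (* bottom of xs: start consuming ys on the way back *)
    | x :: xs' ->
        (match walk xs' with
         | acc, y :: ys' -> ((x, y) :: acc, ys')
         | _, [] -> invalid_arg "zip_rev: second list is shorter")
  in
  fst (walk xs)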

Manual Fusion

Detect that a list is a palindrome, in one pass.

We now know a good way to split a list in two, and a good way to zip a list with its reverse. We can combine the two to get a program that checks if a list is a palindrome. Here’s a first attempt:

But this is doing three passes!

To get around it, we can manually do some fusion. Fusion is a technique where we can spot scenarios like the following:

And translate them into a version without a list:

The trick is making sure that the consumer is written as a fold, and then we just put its f and b in place of the : and [] in the producer.

So, when we inline the definition of splitHalf into zipRev, we get the following:

(adding a special case for odd-length lists)

Finally, the all (uncurry (==)) is implemented as a fold also. So we can fuse it with the rest of the definitions:

You may have spotted the writer monad over All there. Indeed, we can rewrite it to use the monadic bind:

Eliminating Multiple Passes with Laziness

Construct a Braun tree from a list in linear time.

This is also a very well-known trick (Bird 1984), but today I’m going to use it to write a function for constructing Braun trees.

A Braun tree is a peculiar structure. It’s a binary tree, where adjacent branches can differ in size by only 1. When used as an array, it has O(log n) lookup times. It’s enumerated like so:

     ┌─7
   ┌3┤
   │ └11
 ┌1┤
 │ │ ┌─9
 │ └5┤
 │   └13
0┤
 │   ┌─8
 │ ┌4┤
 │ │ └12
 └2┤
   │ ┌10
   └6┤
     └14

The objective is to construct a tree from a list in linear time, in the order defined above. Okasaki (1997) observed that, from the list:

Each level in the tree is constructed from chunks of powers of two. In other words:

From this, we can write the following function:

The first place we’ll look to eliminate a pass is the build function. It combines two rows by splitting the second in half, and zipping it with the first.

We don’t need to store the length of the first list, though, as we are only using it to split the second, and we can do that at the same time as the zipping.

Using this function in build looks like the following:

That top-level zipWith is also unnecessary, though. If we make the program circular, we can produce ts2 as we consume it, making the whole thing single-pass.

That zip3Node is a good candidate for rewriting as a fold, also, making the whole thing look like this:

To fuse all of those definitions, we first will need to rewrite rows as a fold:

Once we have everything as a fold, the rest of the transformation is pretty mechanical. At the end of it all, we get the following linear-time function for constructing a Braun tree from a list:

References

Bird, R. S. 1984. “Using Circular Programs to Eliminate Multiple Traversals of Data.” Acta Inf. 21 (3) (October): 239–250. doi:10.1007/BF00264249. http://dx.doi.org/10.1007/BF00264249.

Danvy, Olivier, and Mayer Goldberg. 2005. “There and Back Again.” BRICS Report Series 12 (3). doi:10.7146/brics.v12i3.21869. https://tidsskrift.dk/brics/article/view/21869.

Okasaki, Chris. 1997. “Three Algorithms on Braun Trees.” Journal of Functional Programming 7 (6) (November): 661–666. doi:10.1017/S0956796897002876. https://www.eecs.northwestern.edu/~robby/courses/395-495-2013-fall/three-algorithms-on-braun-trees.pdf.

OUR VALUED CUSTOMERS: To his friend...

Planet Haskell: Brandon Simmons: A little tool for visualizing ghc verbose timing

I wrote a little tool to graph the fine-grained timing logs produced by ghc when the -v (verbose) flag is given. These logs, written to STDERR, look like, e.g.:

!!! Liberate case [Data.Binary.Class]: finished in 32.06 milliseconds, allocated 14.879 megabytes
!!! Simplifier [Data.Binary.Class]: finished in 873.97 milliseconds, allocated 563.867 megabytes
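
For a sense of the format, here is a rough sketch of pulling the fields out of one such line; this is OCaml purely for illustration, not code from the tool itself:

(* Parse "!!! <phase> [<module>]: finished in <ms> milliseconds, allocated <mb> megabytes",
   returning the "<phase> [<module>]" label plus the two numbers. *)
let parse_timing_line line =
  try
    Scanf.sscanf line "!!! %[^:]: finished in %f milliseconds, allocated %f megabytes"
      (fun label ms mb -> Some (label, ms, mb))
  with Scanf.Scan_failure _ | End_of_file | Failure _ -> None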

The project is on GitHub at jberryman/ghc-timing-treemap, along with a screenshot.

Navigating within the graph is pretty slow, at least in Firefox, and there are some other improvements that could be made. For instance, some of the phases are run multiple times on the same module, and where the module name is logged it would be nice to see these grouped.

Daniel Lemire's blog: Almost picking N distinct numbers at random

In Picking N distinct numbers at random: how to do it fast?, I describe how to quickly pick N distinct values at random from a range of values. It comes down to using either a bitset/bitmap or a hash set.

The bitset approach is very fast when you need to pick many values out of a small range. The hash set approach is very fast if you need to pick very few values, irrespective of the range of values.

What about the middle ground? What if you need to pick lots of values from an even greater range?

Because N is large, you may not care to get exactly N values. That is, if you need to pick 100 million values at random, it might be fine to end up with 99,999,999 values.

What can you do?

  1. Fill an array with N randomly generated values (using a uniform distribution).
  2. Sort the array.
  3. Remove duplicates.

That is pretty good, but the sort function could be expensive if N is large: it is O(N log N), after all.

Assuming that there are no duplicates, can we model this using probabilities? What is the distribution corresponding to the first value? We have N values picked out of a large range. So the probability that any value has been picked is N over the range. We recognize the geometric distribution. Once you have found the first value, you can repeat this same reasoning except that we now have N-1 values over a somewhat restricted range (because we generate them in sorted order).

  1. Generate a value over the range R using a geometric distribution with probability N/R.
  2. Append the value to my output array.
  3. Reduce the range R with the constraint that all future values must be larger than the last value appended to the output array.
  4. Decrement N by one.
  5. Repeat until N is 0.

You can use the fact that we can cheaply generate numbers according to a geometric distribution:

floor(log(random_number_in_0_1()) /log(1-p));

All you need is a way to generate random numbers in the unit interval [0,1] but that is easy. In C++ and many other programming languages, you have builtin support for geometric distributions.

The net result is an O(N) algorithm to pick N values at random over a range.
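
To make the recipe above concrete, here is a rough sketch of the idea in OCaml (the post's own benchmark code, linked below, is the reference); it assumes n <= range and, as discussed next, may return slightly fewer than n values:

(* Walk through [0, range): at each step draw a geometric gap with success
   probability (values still needed) / (values still available), skip that
   many candidates, and keep the one we land on. *)
let pick_approx n range =
  let out = ref [] in
  let cur = ref 0 and needed = ref n in
  while !needed > 0 && !cur < range do
    let p = float_of_int !needed /. float_of_int (range - !cur) in
    let u = 1.0 -. Random.float 1.0 in  (* uniform in (0, 1] *)
    let gap = int_of_float (floor (log u /. log (1.0 -. p))) in
    cur := !cur + gap;
    if !cur < range then begin
      out := !cur :: !out;
      incr cur;
      decr needed
    end else needed := 0  (* overshot the range: stop early *)
  done;
  List.rev !out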

There is a catch, however. My model is not quite correct. For one thing, we do not quite have a geometric distribution: it is only valid if the range is very, very large. This manifests itself by the fact that the values I generate may exceed the range (a geometric distribution is unbounded). We can patch things up by stopping the algorithm once a value exceeds the range or some other anomaly occurs.

So I ran a benchmark where I have to pick 100,000,000 values among all integers smaller than 40,000,000,000. I get that the time per value generated is about half using the geometric-distribution approach:

sort-based 170 ns
geometric 80 ns

For larger arrays, I can achieve 3x to 4x gains but then my software runs out of memory.

My code is available.

What else could you do? Another practical approach would be to divide up the range into many small subranges and to use the fact that the number of values within each subrange follows a binomial distribution (which can be approximated by a normal distribution), to do a divide-and-conquer approach: instead of having one “pick many values in a large range” problem, we would have several small “pick few values into a small range” problems. For each small problem, you can afford to do a sort-based approach since sorting small arrays is fast.

OCaml Weekly News: OCaml Weekly News, 07 May 2019

  1. Function that outputs a functor
  2. New (significant) version of Tablecloth
  3. OCaml Users and Developers Workshop 2019: Call for presentations
  4. Writing internal DSLs with Ocaml
  5. Engineer to work at Inria-Paris (LIP6) on Coccinelle
  6. OCaml syntax support added to astexplorer.net
  7. 2nd Call for Contributions: Summer BOB 2019 [Aug 21, Berlin, deadline May 17]
  8. Let+ syntax backported to OCaml >= 4.02
  9. ppx_mysql 1.0
  10. The future of ppx
  11. Dune: serious bug found in the design of library variants
  12. Other OCaml News

CreativeApplications.Net: Sensorium Festival 2019: The Augmented Mind and Where to Look For It

From the inventions of computing pioneer Douglas Engelbart to the philosophies of Andy Clark and David Chalmers: curator Philo van Kemenade reveals what inspired the 2019 edition of Bratislava’s Sensorium Festival (June 7-9)

Tea Masters: The aged tea event at PSU

A tea box from Hong Kong from the 1930s
2 weeks ago, Teaparker and I returned a 6th time to Penn State in order to teach tea to a student organization created by Jason Cohen in 2010. After Oolong, Puerh, Chaozhou gongfu cha, Yixing teapots, Porcelain, this year's theme was Aged Teas.

The concept of 'aged tea' is actually rather new. For a long time, tea was considered best young and fresh. In the 1980s, very few people appreciated aged puerh or aged Oolong. Then, Teaparker recalls paying 40 USD for a red mark cake of puerh from the 1950s, which now costs well over 100,000 USD! This was just a little bit more expensive than good Taiwanese Dong Ding Oolong, but it was already much more than a new cake (1 or 2 USD at the time). This shows that the rare merchants and lovers of aged puerh already recognized the value of these teas 40 years ago.
In recent years, tea auctions in Hong Kong and Shanghai have helped popularize aged puerh. Record prices make for great headlines and have grabbed the imagination of tea collectors all around Asia. A Song Pin Hao tong (= 7 cakes) from the 1920s set a record of 13.3 million HKD or 1.7 million USD! That's over 700 USD per gram (vs 41 USD for gold)!
A Qing dynasty pewter caddy for Da Hong Pao from Tian Xin
The craze for aged teas doesn't stop at puerh anymore. Oolongs, white teas, even green teas can age well under the right circumstances, as Teaparker demonstrated with a 40 years old Shi Feng Longjing green tea from his collection.
But not everything that is old is gold! The wine market is a good example: only very fine wines see the price of their past vintages increase consistently. Common, everyday wines don't see their value or their price increase.
The main thing to understand is the value, the quality of the tea. What is the difference between a tea that is aged and valuable, and one that is old, past its prime?
This was probably the major benefit of attending this 4 days event: the tea students at PSU got the opportunity to smell and taste some marvelous aged teas. A 1950s red mark puerh, a scented Baozhong from the 1960s, a 1975 7542, a Wuyi Yan Cha from the 1980s, a Dong Ding Oolong from spring 1982, a TGY from 1990...
What are the common traits of all these well aged teas? They are very clean in terms of scents and appearance. The brew is shiny, not foggy. The dry scents are light.
The taste is pure, smooth and has a long aftertaste. It doesn't feel old, but exhales an energy that can be felt in your body.
This kind of aged tea is very different from an old, tired, flawed tea. It's a little bit like the difference when a professional pianist plays these versions of 'Twinkle, twinkle little star' and when a young child (that is not yours!) plays the same first notes. On the one hand you have art, perfection, beauty and on the other you have something very rough and uncomfortable for your ears!
For music, it's usually quite easy to tell which is art and which is beginner's level. For tea, this can be much more tricky, especially if the environment is beautiful, the people nice, the story well told... And if you have never had a great cup of aged tea! That's why it was important to let them experience good aged teas.
This education is extremely valuable for at least 2 reasons. First, it will prevent these students from making simple purchasing mistakes. The very high prices of aged tea have created lots of incentives to sellers to offer expensive teas that are fake or simply old and bad, or that exaggerate their age.
Second, it gives these tea drinkers a glimpse into the world of the most refined teas. I didn't see anybody not liking these teas. But, of course, how much one likes (and values) these teas is very personal and varies from one person to another. This is a good motivator to start your tea collection early, so that you don't have to sell your house in order to enjoy a 90 year old Song Pin Hao!
There are several strategies to build your own tea collection:
1. start small and experiment with different teas and jars to get a feel for how different teas evolve with time.
2. choose young teas that have a good reputation and a good potential to age well. The best candidates are sheng gushu puerh and medium roasted Oolongs. When it comes to aging, quality is more important than quantity. 1 kg of roasted SiJiChun won't age as nicely as 100 gr of Hung Shui Oolong from Dong Ding or Alishan. Better age 1 excellent puerh cake than 1 tong of plantation puerh.
3. If, like me, you are much closer to 50 than 20 or 30, then you may also consider teas that are 10 to 20 years old and that show the right aging potential and reasonable prices. This 1999 7542 or this 1998 Hung Shui Oolong from Lishan would give you a good head start at a reasonable premium.
My own conclusion from this aged tea event is that aged teas don't taste old, but elegant and extremely smooth. They linger on the palate and their pleasure is both long lasting and very unique. The lesson is that their beauty has less to do with their age than with their intrinsic qualities. These teas don't become excellent by miracle, but because they had this potential in their leaves from the very beginning. Appreciating aged teas also helps us better define what is a good young tea.
Remember the perfection of these aged masterpieces and share your tea happiness!
Note: Looking again at these pictures, I get to see the concentration, interest and passion of these students. Even with more than 20 students at a time, they'd manage to focus all their attention on the subject of tea. This made teaching and learning so easy! Thanks to all who attended the classes.

Explosm.net: Comic for 2019.05.07

New Cyanide and Happiness Comic

Penny Arcade: News Post: Wranglin&#8217;

Tycho: Gabriel and his clan, the Crewhuliks, have been dining on a choice slice of free expansion content for Sea of Thieves called “Tall Tales: Shores of Gold” which sounds piratical as all get-out.  It makes it seem like some pirate shit is gonna happen - and fast. Alongside something that can only be considered a “campaign,” they’ve offered a new activity - the ability to fish.  Fish can now be caught, grilled, or even married, so be very careful because a lot of this stuff is in the same menu and an errant flick of the wrist could consign you to some…

Penny Arcade: News Post: Overwatch Action Figures!

Gabe: I used to love playing with action figures as a kid and I was fortunate enough to grow up during what surely must have been their golden age. In the 80’s it seemed like we got a new world full of heroes and villains every Saturday. You had the big ones like G.I. Joe, He-Man and Transformers but then there were hundreds of smaller settings like M.A.S.K., Jayce and the Wheeled Warriors, Centurions, and Visionaries. Even as a kid I recognized on some level that these were simply commercials for toys but I loved them all anyways. I was obsessed with these worlds and the very first stories I…

Perlsphere: What's new on CPAN - April 2019

Welcome to “What’s new on CPAN”, a curated look at last month’s new CPAN uploads for your reading and programming pleasure. Enjoy!

APIs & Apps

Data

Development & Version Control

  • Future::IO provides Future-returning IO methods
  • IPC::ReadpipeX bypasses the shell with the list form of readpipe/qx/backticks for capturing output
  • PerlIO::normalize provides a PerlIO layer to normalize unicode strings on input and output
  • Get a web interface for viewing and inspecting yath test logs using Test2::Harness::UI

Web

MattCha's Blog: 2008 Fangmingyuan Bama, Again!


A surprise from Tiago of Tea Encounter. Dry leaves smell of sweet misty rainforests, distant sweet berries, sweet woody odours.  Sweet very pungent taste very cooling finish. Pungent camphorus wood and a creamy long sweetness. The storage seems drier than other Fangmingyuan puerh from Tea Encounter.  Sticky lips and tongue feeling.  Faint bubble gum breath.  Relaxing qi even spacy and floating body type of qi.  Nice apricot and wood in late infusions.  At $88.89 for a 400g cake or $0.22/g, I think I should buy this one again.

Peace

Quiet Earth: Sci-Fi Mega Blockbuster THE WANDERING EARTH Just Hit Netflix

One of the biggest and most successful sci-fi films of all time has just landed on Netflix, but you probably didn't notice!

Chinese blockbuster The Wandering Earth has broken box office records internationally raking in approximately $700 million. Netflix picked it up and, well, just dropped it without really making much of a thing about it.


Synopsis:
The sun is dying out. The earth will soon be engulfed by the inflating sun. To save the human civilization, scientists draw up an escape plan that will bring the whole human race from danger.

With the help of thousands of fusion powered engines, the planet earth will leave the solar system and embark on a 2,500 year journey to the orbit of a star 4.5 light years away.


Check out the trailer for The Wa [Continued ...]

Penny Arcade: Comic: Wranglin&#8217;

New Comic: Wranglin’

Michael Geist: The LawBytes Podcast, Episode 10: Lowdown on Lawsuits – James Plotkin on Copyright Threats, Notices, and Lawsuits

Copyright threats and lawsuits against individuals have been around in Canada since 2004, when they were rejected by the federal court. Those threats receded for about a decade, but now they’re back. Copyright notices, litigation threats, settlement demands, and actual lawsuits have re-emerged at the very time that the music and movie industries are experiencing record music streaming revenues in Canada and massive popularity of online video services. James Plotkin, a lawyer with Caza Saikaley in Ottawa, joins the podcast this week to help sort through what the notices mean, the implications of the threats and lawsuits, and where Canadian law stands on the issue.

The podcast can be downloaded here and is embedded below. The transcript is posted at the bottom of this post or can be accessed here. Subscribe to the podcast via Apple Podcast, Google Play, Spotify or the RSS feed. Updates on the podcast on Twitter at @Lawbytespod.

Credits:

House of Commons, November 27, 2018
CBC News, Infringement Notices
CTV News, Company Collects 1 Million IP Addresses of Canadians Suspected of Illegally Downloading
CBC, As it Happens
CBC, Mainstreet NS

Transcript:

LawBytes Podcast, Episode10.mp3 | Convert audio-to-text with Sonix

Michael Geist:
This is Law Bytes, a podcast with Michael Geist

David Lametti:
What began to happen in Canada, Mr. Speaker and I saw this myself a number of times in my teaching is that American rights holders through American law firms would often allege content infringement in Canada and send out a letter saying you've infringed copyright. We're going to sue you, please pay x thousands of dollars by clicking on this link and we will forget this. And sadly Mr. Speaker a number of people didn't realize that this kind of claim was actually being made against Canadian law in contravention of Canadian law and actually paid out.

Michael Geist:
Copyright threats and lawsuits against individuals have been around for a while. The Canadian Recording Industry Association, which now calls itself Music Canada, led the charge way back in 2003 with threats to sue Napster users. Lawsuits were launched a year later but were rejected by the Federal Court citing a confluence of concerns involving evidence, privacy and the state of Canadian copyright law. Those threats receded for about a decade but now they're back. Copyright notices, litigation threats, settlement demands, and actual lawsuits have reemerged at the very time the music and movie industries are experiencing record streaming revenues and massive popularity of online services. There's a lot of confusion and concern about what's happening. What the notices mean, the implications of the threats and lawsuits, and where Canadian law stands on the issue. Here to help sort through what's going on is James Plotkin, a lawyer with Caza Saikaley in Ottawa who has taken on several of these copyright cases.

Michael Geist:
Grateful to you for coming on and helping unpack a little bit what’s taking place. I feel like for this episode in particular we need the caveat this is not legal advice we’re having a conversation about our understanding of the law.

James Plotkin:
Of course I would never give legal advice into a microphone at a podcast.

Michael Geist:
Good to hear no nor should you. So why don’t we start by clarifying the difference between the threats that people are seeing often times through the notice and notice system as opposed to the lawsuits and why don’t we start with the threats and the notice and notice system.

CBC News:
Did you hear the one about the 86 year old grandmother who is facing a five thousand dollar fine for illegally downloading a zombie killing video game. It's no Halloween joke. Ontario's Christine McMillan recently received two emails claiming she had illegally downloaded Metro 2033. She says she's innocent and insists her wireless connection may have been hacked. Well guilty or not, McMillan is one of tens of thousands of Canadians who received similar notices. Part of the new rules that came in under changes to Canada's Copyright Act.

James Plotkin:
The notice notice system came in 2012 when Canada modernized its copyright legislation and this was heralded as a quote unquote made in Canada approach to dealing with online piracy of music, films and other copyrighted content. This was in juxtaposition with the notice and takedown system that was in effect and has been in effect in the United States since about 2000 under their copyright legislation called the Digital Millennium Copyright Act. And so the way the notice and notice system works essentially is a rights holder may send a notice of purported infringement or notice of infringement to an ISP and the ISP then without actually disclosing the identity of the subscriber forwards that to the subscriber with usually an introductory text saying you know we received this from the rights holder. We have not verified its veracity but here it is. And then the content of the notice comes to the individual.

Michael Geist:
So the any piracy agency or the rights holder whoever happens to be sending these notices doesn’t know the end of the identity of the individual and the ISP isn’t disclosing it. They’re merely serving as a conduit to transfer on this message.

James Plotkin:
Correct. And that’s under the notice and notice regime and I’m sure we’ll talk a little bit more about some other mechanisms that rights holders are used to in fact get at that information. But the notice and notice is an administrative process essentially that’s supposed to act as an educational and deterrence tool to individuals who hopefully by receiving one or more of these notices will curb whatever habits that they have been doing online to the extent that those individuals subscribers themselves actually have done the infringement. That’s a totally different matter. They may not have been the individuals who infringed if anyone did at all.

Michael Geist:
Okay right. So that I think that’s important point to make. So we’re talking here about an allegation based on some entity trying to monitor activity online. This isn’t proof, it’s not determinative it’s not a judgment.

James Plotkin:
It’s not proof it’s not determinative. And again it’s meant to be a notice it’s not any proof of that leak first of all that legal rights even exist and to the extent that they do that they’ve been infringed. And to the extent that they exist existing have been infringed that they’ve been infringed by the individual who receives the notice because we have to recall that the ISP forwards this information to their subscriber. But if you have six people living in your house and 20 people visiting and using the Wi-Fi it might very well be that someone other than that subscriber did the infringement to the extent that there was one.

Michael Geist:
OK fair enough. And do the ISP have to forward on these notices?

James Plotkin:
And in fact they do. And while there are no damages or any remedies against individuals who received notices there are statutory damages under Section 41.26 of the Copyright Act that can be levied against the ISP for failing to comply with the system and in fact one of these rights holders, called ME2 Productions, is currently pursuing TekSavvy a Canadian ISP attempting to get these statutory damages for alleged failures to properly forward these notices.

Michael Geist:
Ok so the internet providers themselves obviously aren’t self generating the notices they’re serving as this conduit forwarding on the notices and under the current system effectively they’ve got no real choice but to forward on those notices or the ISP itself faces the prospect of real liability.

James Plotkin:
That’s correct.

Michael Geist:
And it’s interesting. My understanding is the systems was in place well before it became formalized within the copyright act in 2012 and then took a couple of years until it actually took effect but this was used on an informal basis for for many years and seemed to have some amount of effectiveness in terms of addressing the behaviour that been talking about. So where did we go wrong in terms of what what we see taking place right now.

James Plotkin:
Well I suppose whether we went wrong depends on who you ask. One of my issues with this system from my perspective is that up until recently, and quite recently in fact, the content of these notices was not regulated at all. So rights holders could essentially put in whatever they want. And I think there may be varying levels of classiness by people in their notices; I think some had more of a shakedown approach whereas others were actually trying to educate and deter further infringements. And this all changed recently, about six months ago in fact, when parliament, as part of the budget bill, I believe it was C-86, introduced an amendment to Section 41.25 of the act and added a subsection 3 which prohibited the inclusion of certain content, and among that offers to settle and any request or demand for either payment for an alleged infringement or for personal information. So this could not be used as a way to actually get at the person's identity or to extract a settlement from them directly.

Michael Geist:
Ok so when you say some were being used as a shakedown essentially what you’re saying is that some were putting in some sort of settlement demand or legal demands in the notification itself.

CTV News:
Frankly this sounds a little bit creepy. So now I know what it does. I think you know if there’s a million Canadians out there million Canadians being monitored who exactly is doing the monitoring and what kinds of information do they have. Ok. Well we are one company that is doing monitoring of pirated content. So we’re not monitoring people we’re monitoring the pirated content. If people go to the pirated content to download it they may become subject to our monitoring effort.

James Plotkin:
Yes. I’ve seen I’ve heard anecdotally and in one instance I’ve actually seen one of these notices from from way back and yeah the content as I said there are varying degrees of classiness. I mean I recently a client of mine showed me one from HBO and they didn’t do that. It was really more of an education approach. They didn’t threaten the lawsuit they didn’t threaten to take legal action but you know left them on the table of course in the event the infringement continued and he also said that it came remarkably quickly. So this individual had downloaded an episode of the popular show Game of Thrones and he said that shortly within five or 10 minutes after having completed the download he had received a notice from HBO or from the ISP or HBO via the ISP quite quickly in fact and so they asked why.

Michael Geist:
Within minutes of downloading, it went from HBO, or whoever does the monitoring for them, to the ISP, and was then forwarded along directly to the subscriber.

James Plotkin:
Within minutes, and so it seems HBO might have an even more advanced content surveillance system than others, but I'm just speculating.

Michael Geist:
Well, that's pretty amazing to hear, the speed with which they were able to do that. Now, for those that were so-called less classy, I assume they included some kind of link saying, in effect, please click on this link, and there would be some sort of page that would allow the person to pay a fee and settle the claim. They're doing so without knowing who the person is, without having proven the allegation. They're just, in a sense, taking advantage of a subscriber who might not be aware that they don't know any of that, and who simply pays out of fear of what might come next.

James Plotkin:
That's right. Well, that's what I've heard. To be fair, I haven't seen one of those with the link itself, but I have heard, again anecdotally, stories of that nature, and the frightening language is intended to extract a settlement from people who may, or very well may not, have perpetrated the alleged act of infringement. So that I think was a concern, and I think it was borne of that concern that this new amendment to the law came into effect and actually started regulating the content. And I know that this was criticized by some people for not having happened sooner.

Michael Geist:
Sure. And I think I was one of those people.

James Plotkin:
You might have been.

Michael Geist:
I should note that I have seen some of those notices. There was a period of time where I was getting emails from recipients, certainly on a weekly basis, who weren't sure what to do and were left feeling rather helpless and a little bit hopeless, because I think for a lot of people it can be a pretty scary thing to receive what feels like a legal demand. So the government sought to address this by prescribing, as you suggest, limits on the language.

CBC As It Happens:
What happens when, say, HBO finds out that you have been illegally downloading Game of Thrones is that they send an email to your Internet provider. And under Canadian rules that came into effect in January 2015, your provider then has to pass that email on to you. And that is what has been happening at the University of Manitoba repeatedly.

Michael Geist:
Is this likely to address the problem?

James Plotkin:
Well, I suppose time will tell, and I can't claim to have any real empirical information on this. But one hopes that, to the extent the notices can no longer demand settlement, or give any kind of indication that liability has been proven or found, or that the person who receives the notice is the one who is liable, if anyone is, it should cause people, at least the careful readers, to look at blogs like yours or other sources online to inform themselves as to what these notices really are. I guess the most we can hope for is education from outside, spurred on by less aggressive content in the notices.

Michael Geist:
Let's hope so, in the sense that if you believe this is a problem, and I certainly do. I know, for myself, having spoken to at least a couple of ISPs, there is a lingering concern that they may still be forwarding on many of these notices, and at least anecdotally I think that's what's taking place, in part because, while there may be rules now about what can be included, there is no standardization in terms of what's included in the notices, more or less classy so to speak. And if you're trying to forward on those notices as quickly as apparently they are, sometimes literally doing it within minutes, the ability to actually dig into whether or not a notice includes any of the content that might be offside with what the government wants to see represents a significant challenge, because the law doesn't say that you can't forward that notice, it merely says you're not required to forward a notification if it includes that information.

James Plotkin:
Right, and I think to that point, and it's a good one, these processes, obviously in the HBO case but certainly, as far as I'm aware, across all of the ISPs, certainly the big ones, are automated electronic processes, not manual ones. And for those who are more interested in the mechanics, at least of how Rogers does this, I would commend to you the oral submissions and the facta of the parties in the Rogers and Voltage case that the Supreme Court decided recently, because they go into some detail. Counsel for Rogers goes into some detail on how this is really done on the ground, and it's certainly not an individual saying, oh well, here's a notice and here's what it says. They're not really checking it for compliance in that way, as far as I'm aware, and as far as the argument indicated to me when I viewed it.

Michael Geist:
And I think that's my understanding too: the numbers are just too big in terms of being able to look at this individually from an ISP's perspective. That provides a useful segue into the other side of the story, and in a sense the Rogers and Voltage case sits a little bit in the middle of some of this, because it of course had references to notice and notice and to litigation as well. Perhaps why don't we unpack that case a little bit and then get into the other side of what we're seeing take place, which is the lawsuits side.

James Plotkin:
Sure. The Rogers and Voltage case is actually, I think, useful in a couple of respects. But if I boil it down to its essentials, it was really about whether and to what extent an ISP can charge for the cost of complying with what's called a Norwich order, and I'll explain what that is in just a second, over and above what they have to incur to effect their duties under the notice and notice regime. Under the Copyright Act, and as the court found in this case, there are certain obligations, both express and, flowing from them, implied, as to information retention, management, and sending the information along as regards the notice and notice regime. And while the law leaves room for the Governor in Council to allow ISPs to charge for that, currently there is no such prescription, and so for that reason, as interpreted by the courts, ISPs are forbidden from charging any fee for compliance with the notice and notice regime.

Michael Geist:
Okay, let's just pause for a second to make sure that everyone's clear on that system. We've talked about how ISPs are processing large numbers of these. The system envisions the possibility of ISPs charging for this, but only if the government sets a fee, and at the moment the government hasn't set a fee. So from an ISP perspective, one of the reasons they have presumably sought to automate this isn't just volume but of course cost, because at the moment all of those costs are being incurred by ISPs, and ultimately, arguably, it's going to be subscribers, Internet users, who bear some of these costs, assuming those costs get offloaded at the end of the day as part of what we pay monthly for Internet services.

James Plotkin:
Correct. Okay I agree.

Michael Geist:
All right. So they're not charging for it. And this case, the Supreme Court of Canada case, talks about that interface between the notices and the Norwich order.

James Plotkin:
Okay. So a Norwich order is essentially a third party discovery order that permits somebody, in this case a rights holder, to obtain information from a non-party that is necessary to prosecute the action. In this case, the way these lawsuits have gone, and there are a number of them, there's the famous Voltage Pictures reverse class action, but in fact there are 16 or 17 other, smaller lawsuits going on in the Federal Court, and we can talk about that later. The way these work is the rights holder generally enters into an agreement with a third party Internet surveillance company that monitors the BitTorrent protocol to ascertain which IP addresses are in the swarm, uploading and making available the work at any given point in time. Then, with the IP address, the rights holder can determine which ISP the person is with, but does not have their identity. And so, in order to actually send the individual a statement of claim, serve them, and get the actual legal process going against them, what they do is begin the claim against John Does, basically placeholder defendants, obtain Norwich orders requiring the various ISPs to disclose the subscriber information, and then sue those individuals. And that's essentially the system that has been going on in the Federal Court now for a couple of years.

Michael Geist:
So rights holders, or at least the agents working on behalf of these rights holders, actively monitor internet traffic and identify IP addresses. But in doing so they don't necessarily know who those individuals are, though they will know from the IP address block which provider the person happens to be using. And once they've done that, they're then able to use this legal process to effectively require the Internet provider to disclose the identity of the subscriber so that they can proceed with their legal action.

James Plotkin:
Exactly, and now looping back to Rogers and Voltage, that case was about, again as I said, whether and to what extent the ISP can charge for the cost of complying with the Norwich order for any activities that are not expressly or impliedly already required to discharge their notice and notice obligations, because, as we discussed, they can't charge for those. The court found some overlap, as a technical matter, in what has to be done for one and the other, but it found that it was not an entire overlap and therefore sent it back to the Federal Court to actually determine the dollars and cents issues.

Michael Geist:
Ok. So that particular case, ultimately going to the Supreme Court of Canada as you indicate, leaves open the possibility that, at least for that second stage, where someone is now looking to actually engage in a legal process and sue Internet users, the ISP can levy some of their costs: not the ones that involve complying with the notice and notice system, but additional costs that are specific to this kind of litigation.

James Plotkin:
Certainly, and where the court actually lands on how much can be charged in any given case could really have a big impact, because there's a big difference between five cents and five dollars when you're talking about the cost of retrieving thousands and thousands of records. And given that the statutory damages, which are likely the remedy the rights holders would seek to obtain, on the assumption that they actually want to adjudicate these things on the merits, range for non-commercial infringement between only one hundred dollars and five thousand dollars, there does become a potential cost recovery issue if the cost of obtaining the identities of the would-be defendants is restrictive or prohibitively high.

Michael Geist:
And I want to continue with the litigation, but pause for just one moment, because that notion of increased costs just to obtain that information might have a real impact on the ability to pursue this. Is that, in your view, part of the system as a whole, which seems reliant on the notion that individual internet users will not fight? The only way that you can make this work, if you are bringing lots and lots of potential actions against Internet users, is essentially by keeping your costs very low, either through threats like we saw with notice and notice, or through litigation that settles quickly. Because the moment you start increasing the costs of litigation, either to get the identity of the subscriber or, potentially even further, once you have actually formally sued, by having a subscriber say, well, I didn't do this or I don't think I'm liable and I'm going to fight you, suddenly the system of trying to extract some kind of revenue from these individuals kind of withers away because the costs become prohibitive.

James Plotkin:
Correct. And I think there are two sides to that as well. Certainly for the rights holders, and this argument was made by Voltage in the Supreme Court, this is, they say, an access to justice issue, because these are individuals or corporations who want to enforce their copyright, and they're saying that, based on the system as it is procedurally, the costs of litigation, coupled with the low statutory damages and the frank inability to prove real common law damages as far as I can tell, make it, as a practical matter, untenable for them to enforce their copyrights. That's their side of the story. And then the other side is that perhaps the court system is being used here as a settlement mill in a way that is not necessarily commensurate with adjudication on the merits, which is generally the goal of most courts when claims are brought. Of course settlement is always encouraged, to unburden the court system, but if you're using the court system specifically as a settlement device rather than for an adjudication, I mean, query whether the courts will ultimately be happy with that. But it has been going on for a couple of years, and at this point I don't know that anybody has really pressed the issue.

Michael Geist:
Fair enough. Now let's talk about what's been going on for the last couple of years. You mentioned there are at least a couple of different kinds of lawsuits taking place; maybe you can unpack that a little.

CBC News:
People over forty-five years old can remember a time when, if you wanted to watch a movie, you had to either go to the movie theater or wait for it to play on television. There were no VCRs, there was no Netflix, no computers 40 years ago. How the world has changed: now people can watch movies whenever they want. And for more choice in titles, some people choose to download movies using BitTorrent sites that are distributing material for free. That has resulted in some people getting letters from production companies saying they're being sued in what's called a reverse class action.

James Plotkin:
So the one that everybody knows most about is the Voltage Pictures John Doe so-called reverse class action that is proceeding in respect of the Hurt Locker film. This was the first of these Voltage Pictures actions to go, although what's getting significantly less attention, and maybe a little bit more now, is the 16 or so other lawsuits started by other movie studios that all appear to be linked with Voltage. For instance, all of these films are within the Voltage catalogue, number one, and they're also all represented by the same counsel. So it seems to me that while these are different plaintiffs in name, there is a common design here. These are not proceeding procedurally as reverse class actions, or at least not yet, or at least not on a formal basis, because the lawsuits are commenced as simplified actions, under the simplified rules in the Federal Courts Rules dedicated to cases where the monetary relief in question is fifty thousand dollars or less, and there are a few other restrictions I won't bother getting into. But what they're doing is once again suing hundreds, in some cases well over a thousand, individuals with respect to each of these films, and they manage to do so by issuing a single statement of claim for fifty dollars, whereas any defendant who wishes to defend on the merits, because this is not a class defence, must actually defend on their own and do so individually: retain counsel, represent themselves, or whatever.

James Plotkin:
So the access to justice and cost savings do exist, but I would submit that they are perhaps a little one-sided in favour of the plaintiff in this instance. As for the clients that I represent, I have close to a dozen active files in these matters; none of them are in the Voltage Pictures case, because there's a whole different issue there with certification. But the individuals who are getting statements of claim in the mail in these other cases are frankly in a more pressing position, because they have deadlines to defend or negotiate or settle, and they often really don't know what to do when they come to me, which is what I try to help them with.

Michael Geist:
Ok. Let's just deal with the reverse class action quickly and then move on to the kinds of cases that you've been dealing with. So people may be familiar with class actions, where there's a large pool of individuals who may have been harmed in a certain circumstance. Individually their claim may not be worth that much, but collectively it may make economic sense to come together and thus use the class action system. What's a reverse class action?

James Plotkin:
Well, true to its name, it is the opposite of the normal class action, wherein the plaintiff is the class: you have a representative plaintiff and a class of individuals with whom that person shares common issues and points of contention with one or more defendants. In this instance, it is one plaintiff and a pool of defendants who purportedly have engaged in activities that are common in law or in fact, such that they are susceptible to class-wide resolution. And just to unpack that for a minute: a lot of the time people think that a class action always proceeds with respect to the entirety of the claim, but that's not always the case. There are certain individual issues that are not generally susceptible to class-wide determination. So, for instance, causation in a negligence claim would be one; damages in any sort of tort claim, and indeed in this instance, would probably also have to be an individual issue, because where within the statutory damages range the amount should fall with respect to any individual depends on the facts of their case and the factors in the law that are weighed to that effect. So even though this is proceeding as a reverse class action, there might be some common determinations, for example that the plaintiff owns copyright in the work; that is something that could be a common issue and is susceptible to class-wide determination because it's the same in respect of every defendant. But the fact that this is a reverse class action does not mean that everything will necessarily be certified as a common issue and therefore proceed on a class-wide basis.

Michael Geist:
And where are we right now with respect to this particular, quite large, claim?

James Plotkin:
So we're now approaching the certification phase. Up until now the class has not, as far as I know, been exhaustively defined. There was a security for costs motion brought by the representative defendant, Mr. Salna, and he was successful in obtaining security for costs, if I'm not mistaken in the amount of roughly seventy-five thousand dollars. I think that was fought, but ultimately it has now been paid, and now that it's been paid the process has been unfrozen. To the best of my knowledge, and to be fair I haven't looked at the Federal Court proceeding queries in a little while on this, I believe they're now at the certification stage to figure out, as I was saying, what will be certified: whether there will be a certified class action and, if so, what issues will be certified for common determination.

Michael Geist:
So there are still several legal hoops to go through here: whether or not this gets certified at all, and if it does, on what basis. And then of course there's the prospect of potential litigation on those issues, because that still doesn't prove that, in this case, the individual users themselves infringed copyright, if that becomes one of the issues that does get certified.

James Plotkin:
Mm hmm. It's also worth noting that at certification, in the ordinary course in a plaintiff-side class action, there is an opportunity and a mechanism for plaintiffs to opt out of the class so that they can pursue things individually. Likewise in this case, one would think that defendants, at the certification phase or thereafter, would have a mechanism for opting out of the class as defendants. So for instance, if defendant number one wants to hire their own lawyer and doesn't want class counsel, they should have an opportunity to do so. And in my view that's an important procedural fairness and rule of law issue that ought to be respected in the reverse class action context as well.

Michael Geist:
So in other words, if I don't want to be sued under this class action, I should have the right to say I don't want to be sued under this class action.

James Plotkin:
And pursue me individually. That should be a right, and again, at this point it's all kind of up in the air. I know the reverse class action has some precedent in the provincial courts, but the case law there is few and far between, and I'm not aware of any cases, certainly in the IP context, where this has happened in Canada. I believe everyone agrees it's quite unprecedented, so we'll have to see how it goes.

Michael Geist:
So, really a novel case. The other aspect of litigation you've alluded to already is that there is a whole series of other cases proceeding, not on the reverse class action side, but as more traditional cases, although potentially using some tactics within the Federal Court rules that raise some issues. Perhaps we can unpack that a little bit.

James Plotkin:
Sure. And I think it's worth noting that the statements of claim in all of these actions, at least the ones that I've seen, and I've seen certainly over half a dozen of them, are essentially identical boilerplate, with modifications for the name of the work and other such things. But really these claims are proceeding on two different theories of liability. Theory number one: you, the Internet subscriber, are the person who downloaded, uploaded, made the work available, communicated it to the public by telecommunication, and so on and so forth, and therefore you are liable for infringement for being the person who did the things that only the copyright owner can do. The second theory they're basically going on is an authorization of infringement theory. They're saying, well, even if you're not the individual who perpetrated the act yourself, it's your internet connection and you are responsible. In fact, the words negligence and willful blindness, if I'm not mistaken, are even used in the pleading to suggest that something of a duty of care is owed to, I suppose, rights holders: that if you're the internet subscriber, you're responsible for what happens on your connection, and therefore you have infringed by authorization. And that second theory is the one I'm particularly interested in challenging, because as I read the case law, that's quite a stretch.

Michael Geist:
Well, it would be a remarkable stretch. It's essentially saying that anyone who has an Internet service owes a duty of care for how it gets used, not just by themselves but by anyone who might gain access, to the point that even things like open networks are essentially forbidden, because how could you meet your duty of care if you didn't necessarily know who was accessing your network?

James Plotkin:
Correct. And there is case law on authorization of copyright infringement. Well, there are a couple of cases, but most notably there's the BMG case, and there's also the CCH decision from 2004, which is mainly known as the benchmark decision on fair dealing in Canadian copyright law but also deals with the notion of authorization. In that case there was an argument that, by making photocopiers available in the Great Library, the Law Society was actually authorizing copyright infringement to the extent that those who used those photocopiers used them to copy more than a fair amount of a work, any book or periodical or what have you. And the court said no: authorization is, I can't recall the exact words off the top of my head, but something like to countenance, to sanction, to actually proactively do something. The court does say that, as far as the evidentiary burden goes, a quote-unquote sufficient degree of indifference might in some instances amount to authorization, although, and I won't get into the background of this, they cite some UK case law that again, it seems to me, does not support what the plaintiffs are doing here. But more importantly, the court also says there is a presumption that when one authorizes someone to use a technology, they're authorizing only licit, legal uses. That presumption can of course be rebutted, but the fact that in that instance the Law Society made the photocopiers available was not enough to show authorization. And I would suggest that, likewise here, the mere fact that I allow you to use my Internet connection does not mean I allow you to do illegal things on it. I mean, if we take this argument to the absurd, if you use my Internet connection to go on Silk Road and hire an assassin, I'm now, you know, an accessory to a criminal act. That's a different context, but I think it would be the reductio ad absurdum of this position.

Michael Geist:
You know, the implications are enormous, not just in copyright but in other ways as well, and it would arguably change the way in which many people are able to access the Internet today. So we've got claims both that people are infringing and that they're using a network that is allowing for infringement, and in doing so can be seen to be liable on those grounds. Have any of these cases gone to trial, or have any of these issues been tested yet?

James Plotkin:
No, in fact in Canada all of these issues remain untested, and all of the cases I was talking about have not passed the pleading stage yet, and there's no sign that they will anytime soon, on the basis of the way things have been going. In several of these matters there's an initial wave of Norwich orders, and then people are sued and served, and then there's another wave maybe six months later, and another wave six months later. And so the pool of defendants grows and grows. Ultimately this is the rights holders' way of perhaps maximizing settlement opportunities, by having waves of notices and statements of claim that go out, so it kind of refreshes the settlement incentive pool, if I could put it that way. I will also say that in the United States, the 9th Circuit and the courts in California heard a case that is somewhat similar. The law there is not exactly the same, and I won't get into the US idiosyncrasies, but the court there found that merely being the internet subscriber was insufficient to state a claim for contributory infringement, which is again not the same but roughly analogous to authorization in the Canadian context.

James Plotkin:
And so again, while there might be points of distinction, I think that's helpful. I would also note that in the Rogers case, and this was arguably obiter dictum, Justice Brown, who wrote for the majority, acknowledged at paragraph thirty-five of the decision that there may be instances in which the person who receives the notice as the subscriber, in his words, in fact will not have illegally shared the copyrighted content online. And he says this might occur, for example, where one IP address, while registered to the person who receives the notice, is available for use by a number of other individuals. So without necessarily coming out and saying it, it seems to me that Justice Brown is leaving room for the possibility that simply being a subscriber, once again, is not enough to amount to authorization, and I expect that is a point that will become live should these cases ever move forward to adjudication.

Michael Geist:
So there are some significant arguments that can be raised, certainly on the authorization side, including fairly recent Supreme Court jurisprudence that raises some real doubts about that legal theory, and we've also got some questions even on the other side of their legal theory. Does that suggest that Internet users can simply ignore this? On the notice and notice side, quite clearly, it's just a notice: they can think about their behaviour, but they don't have to respond. What about if they happen to receive one of these lawsuits?

James Plotkin:
Well, and again, without advising or not advising, it would generally be inadvisable to ignore a statement of claim, because it would permit the plaintiff to proceed to get a default judgment against the individual, which limits or basically ends their rights to participate in the proceeding unless the default is lifted, and then damages can be ordered against them in their absence, and of course in an undefended proceeding the damages are likely to be higher. So ignoring these is probably not a good strategy.

Michael Geist:
Fair enough. And what kind of damages are we talking about in a copyright context?

James Plotkin:
Well, again, to the extent that the infringement is non-commercial in nature... well, let me take a step back. Under the Copyright Act there is a whole suite of remedies. With respect to the monetary remedies, the rights holder can generally elect between receiving ordinary common law damages, which are damages for lost profits and lost opportunity, and an accounting of profits, which is basically a disgorgement, as we say in law, of the profits that the infringer made. So that's one option. The other option is to elect statutory damages, which alleviates the plaintiff's burden to actually prove any causal connection with the damages and creates a range. The law sets out a number of factors, including the good and bad faith of the parties, the need for deterrence, and a few other things, for the court to figure out where within the statutory damages range an award should sit in a given instance based on those facts. And again, for non-commercial infringement, the range is one hundred dollars to five thousand dollars for all infringements. That's important, because it means that in principle a person can infringe copyright for a number of years and then it's really a race to the court: the first rights holder who seeks to sue may do so, and that essentially bars any other rights holder from suing for infringements that occurred up until that point. The liability can be renewed after that point. And then there's commercial infringement, where the range is significantly larger, between five hundred and twenty thousand dollars on a per-infringement basis, and there's no same first-to-court rule there. Now, in most of these cases these would almost certainly all be non-commercial infringements. I think there are some arguments as to whether a landlord who supplies internet renders it a commercial infringement, and I'm sure that the plaintiffs will make that argument, but the case law on that is not settled at all either. So certainly for the overwhelming majority of people who receive these statements of claim, their maximum liability is almost certainly going to be capped at five thousand dollars. And frankly, that would be a pretty high damage award for something like this.

Michael Geist:
Ok, so not something that anybody can ignore, but on the non-commercial side the range is one hundred dollars at the low end and five thousand dollars at the high end, and it sounds unlikely, or at least it would be unusual, to see a judge in a single case go for five thousand dollars. But of course this hasn't been tested yet.

James Plotkin:
Correct. It could happen. But again, this is not the sort of financial liability where people have to start selling their belongings; that's not the issue. But it is a real court action and it's not something to be ignored, in my view.

Michael Geist:
James thanks so much for joining me on the podcast.

James Plotkin:
My pleasure.

Michael Geist:
That's the Law Bytes podcast for this week. If you have comments, suggestions or other feedback, write to lawbytes@pobox.com. That's L-A-W-B-Y-T-E-S at pobox dot com. Follow the podcast on Twitter at @lawbytespod or Michael Geist at @mgeist. You can download the latest episodes from my website at michaelgeist.ca, or subscribe via RSS, Apple Podcasts, Google, or Spotify. The LawBytes Podcast is produced by Gerardo LeBron Laboy. Music by the Laboy brothers: Gerardo and Jose LeBron Laboy. Credit information for the clips featured in this podcast can be found in the show notes for this episode at michaelgeist.ca. I'm Michael Geist. Thanks for listening and see you next time.

The post The LawBytes Podcast, Episode 10: Lowdown on Lawsuits – James Plotkin on Copyright Threats, Notices, and Lawsuits appeared first on Michael Geist.

Perlsphere: Renaming modules from the command line

As we continue to build Tau Station, the free-to-play narrative sci-fi MMORPG (what a mouthful!), we find our code base is getting larger and larger. As with any large codebase, we've found ourselves sometimes mired in technical debt and need a way out.

One simple hack I wrote has saved so much grief with this process. I can run this from the command line:

  bin/dev/rename-module.pl Veure::Old::Module::Name Veure::New::Module::Name

And it automatically renames the module for me, updates all references to it, creates directories as needed, renames test classes built for it, and uses git.

This is that script:
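A minimal sketch of the idea (illustrative only, not the actual Tau Station script, which relies on their custom in-house tooling; it assumes a conventional git checkout with a lib/ directory):

  #!/usr/bin/env perl
  use strict;
  use warnings;

  use File::Basename qw(dirname);
  use File::Path     qw(make_path);

  # Illustrative only: rename a module in a git-managed lib/ tree and update
  # every reference to it. A real tool would also rename the module's test
  # classes; that step is omitted here for brevity.
  my ($old, $new) = @ARGV;
  die "usage: $0 Old::Module::Name New::Module::Name\n" unless $old && $new;

  sub module_path {
      my ($name) = @_;
      ( my $path = $name ) =~ s{::}{/}g;
      return "lib/$path.pm";
  }

  my ($old_path, $new_path) = ( module_path($old), module_path($new) );

  # Let git record the rename, creating target directories as needed.
  make_path( dirname($new_path) );
  system( 'git', 'mv', $old_path, $new_path ) == 0
      or die "git mv $old_path $new_path failed\n";

  # Rewrite every file that mentions the old name, including the moved
  # file's own "package" line.
  chomp( my @files = qx(git grep -l -F "$old") );

  for my $file (@files) {
      local @ARGV = ($file);
      local $^I   = '';          # in-place edit, no backup
      while (<>) {
          s/\Q$old\E/$new/g;
          print;
      }
  }

  print "Renamed $old -> $new; review with 'git status' and 'git diff'.\n";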

It is, of course, a bit of a hack and obviously uses a few custom tools we've built ourselves, but curiously, it hasn't botched a rename a single time, including one I did this morning which resulted in a diff of over 1,000 lines.

Though "lines of code" isn't a great metric, if you count our Perl, our Javascript, our CSS, and our templates, we're pushing in on half a million lines of code in Tau Station. Even tiny tools like this can make life much easier. Now that we're not afraid to make sweeping namespace reorganizations, it's easier to keep on top of technical debt.

Trivium: 06may2019

Perlsphere: List of new CPAN distributions – Apr 2019

dist author version abstract
Acme-CPANModules-PERLANCAR-RsyncEnhancements PERLANCAR 0.001 List of my enhancements for rsync
Acme-CPANModules-VersionNumber-Perl PERLANCAR 0.001 Working with Perl version numbers (or version strings)
Algorithm-Backoff PERLANCAR 0.003 Various backoff strategies for retry
Algorithm-MinPerfHashTwoLevel YVES 0.01 construct a "two level" minimal perfect hash
Algorithm-Retry PERLANCAR 0.001 Various retry/backoff strategies
Alien-Build-Plugin-Decode-Mojo PLICEASE 0.01 Plugin to extract links from HTML using Mojo::DOM or Mojo::DOM58
Alien-Build-Plugin-Download-GitHub PLICEASE 0.01 Alien::Build plugin to download from GitHub
Alien-Plotly-Orca SLOYD 0.0000_01 Finds or installs plotly-orca
Alien-Proj4 ETJ 2.019103 Give install info for already-installed proj4
Alien-cmark DBOOK 0.001 Alien wrapper for cmark
Anansi-Script-JSON ANANSI 0.01 Defines the mechanisms specific to handling JSON-RPC.
Anansi-Script-XML ANANSI 0.01 Defines the mechanisms specific to handling XML-RPC.
Anansi-ScriptComponent ANANSI 0.01 A manager template for Perl script interface interactions.
AnyEvent-HTTPD-Router UFOBAT 1.0.0 Adding Routes to AnyEvent::HTTPD
App-FileCommonUtils PERLANCAR 0.001 CLIs for File::Common
App-FileCreateLayoutUtils PERLANCAR 0.001 CLIs for File::Create::Layout
App-HTTPUserAgentUtils PERLANCAR 0.001 CLI utilities related to HTTP User-Agent (string)
App-IniDiff SQUIRESJM 0.16 App::IniDiff, scripts to diff and patch INI files
App-IniDiff-IniFile SQUIRESJM 0.16 App::IniDiff, scripts to diff and patch INI files
App-LWPUtils PERLANCAR 0.001 Command-line utilities related to LWP
App-PTP MATHIAS 1.00 An expressive Pipelining Text Processor
App-Spoor RORYMCK 0.01 A CPanel client for the Spoor service
App-TSVUtils PERLANCAR 0.001 CLI utilities related to TSV
App-Task DMUEY 0.01 Nest tasks w/ indented output and pre/post headers
App-optex-textconv UTASHIRO 0.01 module to replace document file by its text contents
App-pcorelist TINITA 0.001 Wrapper around corelist with subcommands and tab completion
Arango-Tango AMBS 0.007 A simple interface to ArangoDB REST API
Bencher-Scenario-CSVParsingModules PERLANCAR 0.001 Benchmark CSV parsing modules
Bencher-Scenario-CallStack PERLANCAR 0.001 Benchmark different methods to produce call stack
Bencher-Scenario-Caller PERLANCAR 0.001 Benchmark some variations of caller()
Bencher-Scenario-INIParsingModules PERLANCAR 0.001 Benchmark INI parsing modules
Bencher-Scenario-TSVParsingModules PERLANCAR 0.001 Benchmark TSV parsing modules
Bencher-Scenarios-RangeIterators PERLANCAR 0.001 Benchmark various range iterators
CBOR-Free FELIPE 0.01 Fast CBOR for everyone
CSV-Examples PERLANCAR 0.001 Example CSV files
Catalyst-View-TT-Progressive LNATION 0.01 Control the wrapper
Convert-BER-XS MLEHMANN 0.0 very low level BER decoding
DBIx-Class-Events GSG 0.9.1 Store Events for your DBIC Results
DBIx-Class-Helper-SimpleStats RRWO v0.1.0 Simple grouping and aggregate functions for DBIx::Class
Database-Async-Engine-PostgreSQL TEAM 0.001 PostgreSQL support for Database::Async
Devel-Cover-Report-SonarGeneric TOMK 0.1 SonarQube generic backend for Devel::Cover
Devel-Optic BTYLER 0.001 JSON::Pointer meets PadWalker
Device-RAID-Poller VVELOX v0.0.0 Runs various RAID checks for polling purposes.
Dita-GB-Standard PRBRENAN 20190429 The Gearhart-Brenan Dita Topic Content Naming Convention..
FSM-Mouse POTATOGIM 0.1 Finite-State Machine with Mouse
File-Common PERLANCAR 0.001 List files that are found in {all,more than one} directories
File-TVShow-Info BANS v0.01.0.0 Perl meta data extractor from file name for TV Show file.
File-TVShow-Organize BANS 0.32 Perl module to move TVShow Files into their matching Show Folder on a media server.
Furl-PSGI MHOWARD 0.01 Furl, but wired to a PSGI app
Future-IO PEVANS 0.01 Future-returning IO methods
Geo-Index AKH v0.0.7 Geographic indexer
HTTP-Tiny-Cache_CustomRetry PERLANCAR 0.001 Cache response + retry failed request
HTTP-Tiny-CustomRetry PERLANCAR 0.001 Retry failed HTTP::Tiny requests
HTTP-Tiny-Patch-Delay PERLANCAR 0.001 Add sleep() between requests to slow down
HTTP-Tiny-Patch-Plugin PERLANCAR 0.001 Change use of HTTP::Tiny to that of HTTP::Tiny::Plugin
HTTP-Tiny-Patch-SetUserAgent PERLANCAR 0.001 Set User-Agent header
HTTP-Tiny-Plugin PERLANCAR 0.001 HTTP::Tiny with plugins
HTTP-Tiny-Plugin-Cache PERLANCAR 0.001 Cache HTTP::Tiny responses
HTTP-Tiny-Plugin-CustomRetry PERLANCAR 0.001 Retry failed request
HTTP-Tiny-Plugin-Delay PERLANCAR 0.001 Delay/sleep between requests
HTTP-Tiny-Plugin-Retry PERLANCAR 0.001 Retry failed request
HTTP-UserAgentStr-Util-ByNickname PERLANCAR 0.001 Get popular HTTP User-Agent string by nickname
INI-Examples PERLANCAR 0.001 Example INI configuration files
IO-Async-Process-GracefulShutdown PEVANS 0.01 controlled shutdown of a process
IPC-ReadpipeX DBOOK 0.001 List form of readpipe/qx/backticks for capturing output
IPC-Run-Patch-Setuid PERLANCAR 0.001 Set EUID
Inline-Pdlapp ETJ 2.019105 a simple PDLA module containing inlined Pdlapp code
LWP-UserAgent-Patch-Delay PERLANCAR 0.001 Sleep() between requests
LWP-UserAgent-Patch-Plugin PERLANCAR 0.001 Change use of LWP::UserAgent to that of LWP::UserAgent::Plugin
LWP-UserAgent-Plugin PERLANCAR 0.001 LWP::UserAgent with plugins
LWP-UserAgent-Plugin-Cache PERLANCAR 0.001 Cache LWP::UserAgent responses
LWP-UserAgent-Plugin-Delay PERLANCAR 0.001 Delay/sleep between requests
List-AutoNumbered CXW 0.000001 Add line numbers to lists while creating them
Log-Dispatch-Email-Siffra LUIZBENE 0.01 Module abstract (<= 44 characters) goes here
Math-Lapack RUISTEVE 0.001 Perl interface to LAPACK
Matrix-Simple OXJGOUO 1.04 Some simple matrix operations
Mojo-Graphite-Writer JBERGER 0.01 A non-blocking Graphite metric writer using the Mojo stack
Mojo-Redfish-Client JBERGER 0.01 A Redfish client with a Mojo flair
Mojolicious-Plugin-Statsd MHOWARD 0.03 Emit to Statsd, easy!
Mojolicious-Plugin-Syslog JHTHORSEN 0.01 A plugin for enabling a Mojolicious app to log to syslog
Mojolicious-Static-Role-Compressed SRCHULO 0.01 Role for Mojolicious::Static that serves pre-compressed versions of static assets
Net-Checkpoint-Management-v1 ABRAXXA 0.001001 Checkpoint Management API version 1.x client library
OpenStack-MetaAPI ATOOMIC 0.001 Perl5 OpenStack API abstraction on top of OpenStack::Client
PDLA-GIS ETJ 2.019104 Perl Data Language
PDLA-Rest ETJ 2.013004 Perl Data Language
PDLA-Transform ETJ 2.019106 Perl Data Language
Paubox_Email_SDK VIGHNESH 1.0 Perl wrapper for the Paubox Transactional Email API (https://www.paubox.com/solutions/email-api).
PerlIO-normalize COFFEE 0.001 PerlIO layer to normalize unicode strings on input and output
PerlX-ifor PERLANCAR 0.001 A version of for() that accepts iterator instead of list
Proch-Cmd PROCH 0.001 Execute shell commands controlling inputs and outputs
Proch-N50 PROCH 0.01 Calculate N50 from a FASTA or FASTQ file
QuadPres SHLOMIF 0.28.0 a presentation / slides manager.
Range-ArrayIter PERLANCAR 0.001 Generate a tied-array iterator for range
Range-Iter PERLANCAR 0.001 Generate a coderef iterator for range
Range-Iterator PERLANCAR 0.001 Generate an iterator object for range
SMS-Send-BudgetSMS TASKULA 0.01 SMS::Send driver to send messages via BudgetSMS (https://www.budgetsms.net/)
SQL-Abstract-Prefetch ETJ 0.001 implement "prefetch" for DBI RDBMS
Serge-Sync-Plugin-TranslationService-locize DRAGOSV 0.900.0 Locize (https://locize.com/) synchronization plugin for Serge
Serge-Sync-Plugin-TranslationService-lokalise DRAGOSV 0.900.0 Lokalise (https://lokalise.co/) synchronization plugin for Serge
Software-Catalog-SW-filezilla PERLANCAR 0.001 FileZilla
Software-Catalog-SW-phpmyadmin PERLANCAR 0.001 phpMyAdmin
Starwoman ASHLEYW 0.001 because Starman does the same thing over and over again expecting different results
TSV-Examples PERLANCAR 0.001 Example TSV files
Telugu-Keyword RAJ 0.03 Perl extension to provide Telugu keywords
Telugu-TGC RAJ 0.03 Perl extension for tailored grapheme clusters for Telugu script.
Test-FITesque-RDF KJETILK 0.001 Formulate Test::FITesque fixture tables in RDF
Test2-Harness-UI EXODIST 0.000001 Web interface for viewing and inspecting yath test logs
Text-CSV-FromAOH PERLANCAR 0.001 Convert an AoH (array of hashes) to CSV
Tie-Array-File-LazyRead PERLANCAR 0.001 Read a file record by record using tied array and for()
Tie-Array-Log PERLANCAR 0.001 Tied array that behaves like a regular array, but logs operations
URI-s3 KARUPA v0.01 s3 URI scheme
WordList-HTTP-UserAgentString-PERLANCAR PERLANCAR 0.001 A selection of some HTTP User-Agent strings
Yancy-Backend-Static PREACTION 0.001 Build a Yancy site from static Markdown files
cluster MDEHOON 1.58 Perl interface to the C Clustering Library.
p6-RandomColor CBWOOD 0
tap MHOWARD 0.01 add a tap method everywhere
version-dev PERLANCAR 0.001 Set $VERSION based on version from git tags

Explosm.net: Comic for 2019.05.06

New Cyanide and Happiness Comic

Disquiet: Synth Learning: Muxlicer Piano (First Patch)

This is my first patch with a new module, a device from the manufacturer Befaco called the Muxlicer. It is capable of many things involving slicing up an incoming signal and effecting changes upon it, such as triggering all sorts of percussive cues. In this case what’s happening is a sample of an electric piano is being triggered every eight beats, and then for each of those beats (pulses, really) various things occur. In most of the cases it’s a matter of the volume level shifting, but in two cases (that is, on two of the pulses) some heavy reverb is put upon it. In addition, a sliver of the signal is being sampled and replayed in a glitch manner at a lower volume. (Technically the first slice was the same patch processing live guitar chords, but I decided to use a sample playback on this initial round.)

Perlsphere: drawing (but not generating) mazes

I've started a sort of book club here in Philly. It works like this: it's for people who want to do computer programming. We pick books that have programming problems, especially language-agnostic books, and then we commit to showing up to the book club meeting with answers for the exercises. There, we can show off our great work, or ask for help because we got stuck, or compare notes on what languages made things easier or harder. We haven't had an actual meeting yet, so I have no idea how well this is going to go.

Our first book is Mazes for Programmers, which I started in on quite a while ago but didn't make much progress through. The book's examples are in Ruby. My goal is to do work in a language that I don't know well, and I know Ruby well enough that it's a bad choice. I haven't decided yet what I'll do most of the work in, but I didn't want to do it all in Perl 5, which I already know well, and reach for when solving most daily problems. On the other hand, I knew a lot of the early material in the book (and maybe lots of the material in general) would be on generating mazes, which would be fairly algorithmic work and produce a data structure. I didn't want to get all caught up in drawing the data structure as a human-friendly maze, since that seemed like it would be a fiddly problem and would distract me from the actual maze generation.

This weekend, I wrote a program in Perl 5 that would take a very primitive description of a maze on standard input and print out a map on standard output. It was, as I predicted, sort of fiddly and tedious, but when I finished, I felt pretty good about it. I put my maze-drawing program on GitHub, but I thought it might be fun to write up what I did.

First, I needed a simple protocol. My goal was to accept input that would be easy to produce given any data structure describing a maze, even if it would be a sort of stupid format to actually store a maze in. I went with a line-oriented format like this:

  1 2 3
  4 5 6
  7 8 9

Every line in this example is a row of three rooms in the maze. This input would actually be illegal, but it's a useful starting point. Every room in the maze is represented by an integer, which in turn represents a four-bit bitfield, where each bit tells us whether the room links in the indicated direction:

    1
   8•2
    4

So if a cell in the maze has passages leading south and east, it would be represented in the file by a 6. This means some kinds of input are nonsensical. What does this input mean?

  0 0 0
  0 2 0
  0 0 0

The center cell has a passage east, but the cell to its east has no passage west. Easy solution: this is illegal.
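Producing that format from a maze data structure really should be easy. As a quick illustration (a hypothetical emitter, not part of the drawing program), a generator that keeps its grid as rows of direction bitfields could print the protocol like this:

  use strict;
  use warnings;

  # Hypothetical emitter: a maze stored as rows of direction bitfields
  # (the same numbers the drawing program reads) prints one row per line.
  my $maze = [
      [ 10, 12,  0 ],
      [  0,  5,  0 ],
      [  0,  3, 12 ],
  ];

  print join(' ', @$_), "\n" for @$maze;

Piping that output into the drawing program further down produces the reference map used below.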

I made quite a few attempts to get from that to a working drawing algorithm. It was sort of painful, and I ended up feeling pretty stupid for a while. Eventually, though, I decided that the key was not to draw cells (rooms), but to draw lines. That meant that for a three by three grid of cells, I'd need to draw a four by four grid of lines. It's that old fencepost problem.

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2 | 8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

Here, there's only one linkage, so really the map could be drawn like this:

   1   2   3   4
 1 +---+---+---+
   | 0 | 0 | 0 |
 2 +---+---+---+
   | 0 | 2   8 |
 3 +---+---+---+
   | 0 | 0 | 0 |
 4 +---+---+---+

My reference map while testing was:

   1   2   3   4
 1 +---+---+---+
    10  12 | 0 |
 2 +---+   +---+
   | 0 | 5 | 0 |
 3 +---+   +---+
   | 0 | 3  12 |
 4 +---+---+   +

This wasn't too, too difficult to get, but it was pretty ugly. What I actually wanted was something drawn from nice box-drawing characters, which would look like this:

   1   2   3   4
 1 ╶───────┬───┐
    10  12 │ 0 │
 2 ┌───┐   ├───┤
   │ 0 │ 5 │ 0 │
 3 ├───┤   └───┤
   │ 0 │ 3  12 │
 4 └───┴───╴   ╵

Drawing this was going to be trickier. I couldn't just assume that every intersection was a +. I needed to decide how to pick the character at every intersection. I decided that for every intersection, like (2,2), I'd have to decide the direction of lines based on the links of the cells abutting the intersection. So, for (2,2) on the line axes, I'd have to look at the cells at (2,1) and (2,2) and (1,2) and (1,1). I called these the northeast, southeast, southwest, and northwest cells, relative to the intersection, respectively. Then I determined whether a line extended from the middle of an intersection in a given direction as follows:

    # Remember, if the bit is set, then a link (or passageway) in that
  # direction exists.
  my $n = (defined $ne && ! ($ne & WEST ))
       || (defined $nw && ! ($nw & EAST ));
  my $e = (defined $se && ! ($se & NORTH))
       || (defined $ne && ! ($ne & SOUTH));
  my $s = (defined $se && ! ($se & WEST ))
       || (defined $sw && ! ($sw & EAST ));
  my $w = (defined $sw && ! ($sw & NORTH))
       || (defined $nw && ! ($nw & SOUTH));

For example, how do I know that at (2,2) the intersection should only have limbs headed west and south? Well, it has cells to the northeast and northwest, but they link west and east respectively, so there can be no limb headed north. On the other hand, the cells to its southeast and southwest do not link to one another, so there is a limb headed south.

This can be a bit weird to think about, so think about it while looking at the map and code.

Now, for each intersection, we'd have a four-bit number. What did that mean? Well, it was easy to make a little hash table with some bitwise operators and the Unicode character set…

    my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

At first, I only drew the intersections, so my reference map looked like this:

  ╶─┬┐
  ┌┐├┤
  ├┤└┤
  └┴╴╵

When that worked -- which took quite a while -- I added code so that cells could have both horizontal and vertical filler. My reference map had a width of 3 and a height of 1, meaning that it was drawn with 1 row of vertical-only filler and 3 columns of horizontal-only drawing per cell. The weird map just above had a zero height and width. Here's the same map with a width of 6 and a height of zero:

  ╶─────────────┬──────┐
  ┌──────┐      ├──────┤
  ├──────┤      └──────┤
  └──────┴──────╴      ╵

I have no idea whether this program will end up being useful in my maze testing, but it was (sort of) fun to write. At this point, I'm mostly wondering whether it will be proven to be terrible later on.

As a side note, my decision to do the drawing in text was a major factor in the difficulty. Had I drawn the maps with a graphical canvas, it would have been nearly trivial. I'd just draw each cell, and then start adjacent cells with overlapping positions. If two walls drew over one another, it would be the intersection of drawn pixels that would display, which would be exactly what we wanted. Text can't work that way, because every visual division of the terminal can show only one glyph. In this way, a typewriter is more like a canvas than a text terminal. When it overstrikes two characters, the intersection of their inked surfaces really is seen. In a terminal, an overstruck character is fully replaced by the overstriking character.
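For what it's worth, a rough sketch of that canvas approach (using Imager purely as an example raster library; this isn't part of the text-mode drawer) might look something like this:

  #!/usr/bin/env perl
  # Rough sketch of the "graphical canvas" idea: draw each cell's walls as
  # lines and let adjacent cells simply draw over one another. Requires the
  # Imager PNG support to be installed; illustrative only.
  use strict;
  use warnings;
  use Imager;

  use constant { NORTH => 1, EAST => 2, SOUTH => 4, WEST => 8 };
  use constant CELL => 20;    # pixels per cell

  my $grid = [ [ 10, 12, 0 ], [ 0, 5, 0 ], [ 0, 3, 12 ] ];  # same format as above

  my ($rows, $cols) = ( scalar @$grid, scalar @{ $grid->[0] } );
  my $img = Imager->new( xsize => $cols * CELL + 1, ysize => $rows * CELL + 1 );
  $img->box( filled => 1, color => 'white' );

  for my $y (0 .. $rows - 1) {
      for my $x (0 .. $cols - 1) {
          my $cell = $grid->[$y][$x];
          my ($px, $py) = ( $x * CELL, $y * CELL );

          # Draw a wall wherever there is *no* link in that direction.
          $img->line( color => 'black', x1 => $px, y1 => $py, x2 => $px + CELL, y2 => $py )
              unless $cell & NORTH;
          $img->line( color => 'black', x1 => $px, y1 => $py + CELL, x2 => $px + CELL, y2 => $py + CELL )
              unless $cell & SOUTH;
          $img->line( color => 'black', x1 => $px + CELL, y1 => $py, x2 => $px + CELL, y2 => $py + CELL )
              unless $cell & EAST;
          $img->line( color => 'black', x1 => $px, y1 => $py, x2 => $px, y2 => $py + CELL )
              unless $cell & WEST;
      }
  }

  $img->write( file => 'maze.png' ) or die $img->errstr;

Overlapping walls simply paint the same pixels twice, so no per-intersection logic is needed at all.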

It's all on GitHub, but here's my program as it stands tonight:

    #!perl
  use v5.20.0;
  use warnings;

  use Getopt::Long::Descriptive;

  my ($opt, $usage) = describe_options(
    '%c %o',
    [ 'debug|D',    'show debugging output' ],
    [ 'width|w=i',  'width of cells', { default => 3 } ],
    [ 'height|h=i', 'height of cells', { default => 1 } ],
  );

  use utf8;
  binmode *STDOUT, ':encoding(UTF-8)';

  #  1   A maze file, in the first and stupidest form, is a sequence of lines.
  # 8•2  Every line is a sequence of numbers.
  #  4   Every number is a 4-bit number.  *On* sides are linked.
  #
  # Here are some (-w 3 -h 1) depictions of mazes as described by the numbers
  # shown in their cells:
  #
  # ┌───┬───┬───┐ ╶───────┬───┐
  # │ 0 │ 0 │ 0 │  10  12 │ 0 │
  # ├───┼───┼───┤ ┌───┐   ├───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 5 │ 0 │
  # ├───┼───┼───┤ ├───┤   └───┤
  # │ 0 │ 0 │ 0 │ │ 0 │ 3  12 │
  # └───┴───┴───┘ └───┴───╴   ╵

  use constant {
    NORTH => 1,
    EAST  => 2,
    SOUTH => 4,
    WEST  => 8,
  };

  my @lines = <>;
  chomp @lines;

  my $grid = [ map {; [ split /\s+/, $_ ] } @lines ];

  die "bogus input\n" if grep {; grep {; /[^0-9]/ } @$_ } @$grid;

  my $max_x = $grid->[0]->$#*;
  my $max_y = $grid->$#*;

  die "not all rows of uniform length\n" if grep {; $#$_ != $max_x } @$grid;

  for my $y (0 .. $max_y) {
    for my $x (0 .. $max_x) {
      my $cell  = $grid->[$y][$x];
      my $south = $y < $max_y ? $grid->[$y+1][$x] : undef;
      my $east  = $x < $max_x ? $grid->[$y][$x+1] : undef;

      die "inconsistent vertical linkage at ($x, $y) ($cell v $south)"
        if $south && ($cell & SOUTH  xor  $south & NORTH);

      die "inconsistent horizontal linkage at ($x, $y) ($cell v $east)"
        if $east  && ($cell & EAST   xor  $east  & WEST );
    }
  }

  my %WALL = (
    0     | 0     | 0     | 0     ,=> ' ',
    0     | 0     | 0     | WEST  ,=> '╴',
    0     | 0     | SOUTH | 0     ,=> '╷',
    0     | 0     | SOUTH | WEST  ,=> '┐',
    0     | EAST  | 0     | 0     ,=> '╶',
    0     | EAST  | 0     | WEST  ,=> '─',
    0     | EAST  | SOUTH | 0     ,=> '┌',
    0     | EAST  | SOUTH | WEST  ,=> '┬',
    NORTH | 0     | 0     | 0     ,=> '╵',
    NORTH | 0     | 0     | WEST  ,=> '┘',
    NORTH | 0     | SOUTH | 0     ,=> '│',
    NORTH | 0     | SOUTH | WEST  ,=> '┤',
    NORTH | EAST  | 0     | 0     ,=> '└',
    NORTH | EAST  | 0     | WEST  ,=> '┴',
    NORTH | EAST  | SOUTH | 0     ,=> '├',
    NORTH | EAST  | SOUTH | WEST  ,=> '┼',
  );

  sub wall {
    my ($n, $e, $s, $w) = @_;
    return $WALL{ ($n ? NORTH : 0)
                | ($e ? EAST : 0)
                | ($s ? SOUTH : 0)
                | ($w ? WEST : 0) } || '+';
  }

  sub get_at {
    my ($x, $y) = @_;
    return undef if $x < 0 or $y < 0;
    return undef if $x > $max_x or $y > $max_y;
    return $grid->[$y][$x];
  }

  my @output;

  for my $y (0 .. $max_y+1) {
    my $row = q{};

    my $filler;

    for my $x (0 .. $max_x+1) {
      my $ne = get_at($x    , $y - 1);
      my $se = get_at($x    , $y    );
      my $sw = get_at($x - 1, $y    );
      my $nw = get_at($x - 1, $y - 1);

      my $n = (defined $ne && ! ($ne & WEST ))
           || (defined $nw && ! ($nw & EAST ));
      my $e = (defined $se && ! ($se & NORTH))
           || (defined $ne && ! ($ne & SOUTH));
      my $s = (defined $se && ! ($se & WEST ))
           || (defined $sw && ! ($sw & EAST ));
      my $w = (defined $sw && ! ($sw & NORTH))
           || (defined $nw && ! ($nw & SOUTH));

      if ($opt->debug) {
        printf "(%u, %u) -> NE:%2s SE:%2s SW:%2s NW:%2s -> (%s %s %s %s) -> %s\n",
          $x, $y,
          (map {; $_ // '--'  } ($ne, $se, $sw, $nw)),
          (map {; $_ ? 1 : 0 } ($n,  $e,  $s,  $w)),
          wall($n, $e, $s, $w);
      }

      $row .= wall($n, $e, $s, $w);

      if ($x > $max_x) {
        # The rightmost wall is just the right joiner.
        $filler .=  wall($s, 0, $s, 0);
      } else {
        # Every wall but the last gets post-wall spacing.
        $row .= ($e ? wall(0,1,0,1) : ' ') x $opt->width;
        $filler .=  wall($s, 0, $s, 0);
        $filler .= ' ' x $opt->width;
      }
    }

    push @output, $row;
    if ($y <= $max_y) {
      push @output, ($filler) x $opt->height;
    }
  }

  say for @output;
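
As a quick worked example (and a hedged one, traced by hand against the wall table above rather than run here): with the default -w 3 -h 1 cell size, a one-line maze file reading "2 8" describes two cells linked east–west, since cell 0 has its EAST bit (2) set and cell 1 its WEST bit (8). Fed to the script on standard input, that file should render as a single open two-cell room:

    ┌───────┐
    │       │
    └───────┘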

Jesse Moynihan: Tarot Booklet Page 10

VIII JUSTICE The archetype of Justice has always been depicted as female. One of the 4 classical Virtues, she sits between two pillars like The Popess. Her discerning sword is a lightning rod to the Heavens, which she uses to cleave away bullshit. Her action is closely tied to earthly concerns (her robe drapes over […]

OCaml Planet: 7th MirageOS hack retreat

Let's talk sun, mint tea and OCaml: Yes, you got it, the MirageOS biannual retreat in Marrakesh!

For the 7th iteration of the retreat, the majority of the Tarides team took part in the trip to camel country. This is a report on what we produced and enjoyed while there.

Charles-Edouard Lecat

That's it, my first MirageOS retreat is coming soon, let's jump in the plane and here I come. After a nice cab trip and an uncountable number of similar streets, I'm finally at the Riad which will host me for the next 5 days.

Now begins the time to do what I came for: Code, Eat, Sleep and Repeat.

I mostly worked on Colombe, the OCaml implementation of the SMTP protocol, for which I developed a simple client. Apart from some problems deferred for later (like the integration of the MIME protocol, the TLS wrapping and some others), the client was working perfectly :) Implementing it was actually really easy, as the core of the SMTP protocol was done by @dinosaure, who has developed over time a really nice way of implementing this kind of API. And as I spend most of my time at Tarides working on his code, I feel really comfortable with it.

One of the awesome things about this retreat was the people who came: there were so many interesting people doing various things, so whenever someone had a question, you could be almost sure that somebody could help in one way or another.

But sadly, as I arrived a few days after everyone else, and just before the weekend, the time flew by reaaaaaally fast, and I did not have the time to write any major code. But I'm already looking forward to the next retreat which, I am sure, will be even more fruitful and attract a lot of nice OCaml developers.

Until then, I will just dream about the awesome food I ate there ;)

Lucas Pluvinage

Second Mirage retreat for me, and this time I had plans: make a small web game with Mirage hosted by an ESP32 device. I figured out that there was no canonical way to make an HTTP/WebSocket server with Mirage, and I didn't want to stick to a particular library.

Instead, I took my time to develop mirage-http, an abstraction of HTTP that can have either cohttp or httpaf as a backend. On top of that, I've built mirage-websocket, which is therefore an HTTP-server-independent implementation of WebSockets (indeed this has a lot of redundancy with ocaml-websocket, but for now it's a proof of concept). While making all this I discussed with @anmonteiro, who's the web servers/protocols expert for Mirage! However, I didn't have the time to build something on top of that, but this is still something that I would like to achieve at some point.

I also became the "dune guy" as I'm working on the Mirage/dune integration, and helped some people with their build system struggles.

It was definitely a rich week: I've learnt a lot of things, enjoyed the sun, ate good food and contributed to the Mirage universe!

Jules Aguillon

This was my first retreat. It was the occasion to meet OCaml developers from all over the world. The food was great and the weather perfect.

I submitted some PRs to the OCaml compiler!

  • Hint on type error on int literal PR #2301

    It adds a hint when an int literal is used instead of another kind of number literal (e.g. 3 instead of 3. or 3L):

    Line 2, characters 20-21:
    2 | let _ = Int32.add a 3
                            ^
    Error: This expression has type int but an expression was expected of type
             int32
           Hint: Did you mean `3l'?
  • Hint on type error on int operators PR #2307

    Hints to the user when using numerical operators for ints (e.g. +) on other kinds of numbers (e.g. float, int64, etc.).

    For example:

    Line 8, characters 8-9:
    8 | let _ = x + 1.
                ^
    Error: This expression has type float but an expression was expected of type
             int
    Line 8, characters 10-11:
    8 | let _ = x + 1.
                  ^
      Hint: Did you mean to use `+.'?
  • Clean up int literal hint PR #2313

    A little cleanup of the 2 previous PRs.

  • Hint when the expected type is wrapped in a ref PR #2319

    Another PR adding a hint: when the user forgets to use the ! operator on ref values:

    Line 2, characters 8-9:
    2 | let b = a + 1
                ^
    Error: This expression has type int ref
           but an expression was expected of type int
      Hint: This is a `ref', did you mean `!a'?

The first 3 are merged now.

Gabriel de Perthuis

For this retreat my plan was to do something a little different and work on Solo5.

Wodan, the storage layer I'm working on, needs two things from its backends which are not commonly implemented:

  • support for discarding unused blocks (first implemented in mirage-block-unix), and
  • support for barriers, which are ordering constraints between writes

Solo5 provides relevant mirage backends, which are themselves provided by various virtualised implementations. Discard was added to most of those, at least those that were common enough to be easily tested; we just added an "operation not supported" error code for the other cases.

The virtio implementation was interesting; recent additions to the spec allow discard support, but few virtual machine managers actually implement that on the backend side. I tried to integrate with the Chromium OS "crosvm" for that, and had a good time figuring out how it found the bootloader entry point (it turns out the CPU was happily skipping past invalid instructions to find a slightly misaligned entry point), but ran out of time to figure out the rest of the integration, which seemed to be more complex than anticipated. Because of this, virtio discard support will be skipped over for now.

I also visited the souk, which was an interesting experience. Turns out I'm bad at haggling, but I brought back interesting things anyway.

Conclusion

We'd like to thank Hannes Mehnert, who organized this retreat, and all the attendees who contributed to making it fruitful and inspiring. Want to take part in the next MirageOS retreat? Stay tuned here.

Tea Masters: Where is the French-language version of tea-masters?

The first spring Oolongs are arriving, but my shop www.tea-masters.com still has not returned to its original bilingual English-and-French form. I am sincerely sorry about this. Please don't take it as a sign that I am disowning my origins or that my French-speaking customers don't matter to me. On the contrary! True, my English-language orders are at least twice as numerous, but that does not explain the temporary shutdown of my French-language shop. In fact, I use a French platform, Prestashop, for my online store. At the beginning of the year, that company asked its customers to migrate to a new version of its software. I accepted their offer and paid for their plan. Unfortunately, during the migration, the French part of the shop disappeared from the original site! According to their technicians, it cannot be migrated with the basic version; it has to be done with the most advanced version of their software. You can imagine my frustration at seeing something simple become complicated and take so much time! But I am persevering so that I can recover all of my French-language data. I hope the new site will be ready soon, and I thank you for your understanding in the meantime.

Explosm.net: Comic for 2019.05.05

New Cyanide and Happiness Comic

Daniel Lemire's blog: Science and Technology links (May 4th 2019)

  1. It is often believed that colleges help class mobility… if you were born poor, college can help you rise up. So is there more class mobility when more people go to college? Maybe not:

    (…) researchers have characterized a college degree as a great equalizer leveling the playing field, and proposed that expanding higher education would promote mobility. This line of reasoning rests on the implicit assumption that the relatively high mobility observed among college graduates reflects a causal effect of college completion on intergenerational mobility, an assumption that has rarely been rigorously evaluated (…) I find that once selection processes are adjusted for, intergenerational income mobility among college graduates is very close to that among non-graduates.

  2. How many people does it take to form a group? The answer appears to be “at least 5”.
  3. The productivity of a computer science professor is not sensitive to where they got their PhD, but it is correlated with their place of employment.
  4. If you have ever taken management classes, you know about Maslow’s pyramid of human needs. However, Maslow never proposed such a pyramid.
  5. Twenty years ago, Microsoft introduced a mouse that did away with the mechanical ball, replacing it with a digital camera and lights.
  6. Rats were given olive oil, sunflower oil or fish oil. Rats consuming sunflower oil have a shorter lifespan.
  7. Seed oils are a source of Omega 6. It appears that keeping your consumption of Omega 6 low (compared to your Omega 3 intake) is important to reduce cardiovascular events.
  8. Shifting our energy production from coal to gas would allow us to meet our climate stabilization targets.
  9. Many studies find that red meat is associated with higher cancer rates. It is not known whether it is a mere correlation or a causal factor. However, if you narrow down your investigation to Asia, the correlation disappears. The association between red meat and cancer is apparently specific to the Western civilization.
  10. It appears that 60% of all bird species come from Australia.
  11. A new Alzheimer’s-like disease has been identified (it is called “LATE”).

Explosm.net: Comic for 2019.05.04

New Cyanide and Happiness Comic

Disquiet: Synth Learning: Distant Train Vapor Trail Fragments

This is a single field recording of a train horn (source below) being warped simultaneously in various ways by my modular synthesizer. One strand is going through a reverb, another through granular synthesis, another through a lo-fi looper, and another through a spectral filter, and then there’s the unadulterated line, and all of those are being warped or tweaked themselves. The shape of the overall sound, for example, is affecting the density of the granular synthesis. Various LFOs are adjusting the relative prominence of different elements, and other aspects such as the size of the reverb.

The source audio is from freesound.org. It is a distant train horn recorded by Andy Brannan in Wyoming.

A long walk away

This shot is taken right where we got out of our car as we entered the scenic park area – normal cars are not allowed to go in there now; you must take a bus. The only cars allowed in are tea producers’ with the right permit. These are guys whose job it is to haul the raw, freshly plucked leaves from the tea trees inside the scenic area to the waiting cars/trucks, which then haul them to the factories for processing. Each load is 75-100kg. They’re heavy.

These are migrant labourers, mostly from neighbouring Jiangxi province, who get paid something like 200-300 RMB a day to do this work. Sure, the daily wage is fairly high, but it’s back-breaking, and it’s only good for a few weeks of the year. For a good amount of the distance they have to travel, it’s on the paved walkway that you see – this was built in the 80s to accommodate tourists coming here. But then, at some point, you have to take a fork and go up a dirt path along a sometimes fairly steep hill, and then imagine coming back down from the hill with 100kg of tea on your back, in rainy conditions when the paths are super slippery. This is not easy work at all.

The path was filled with people doing tea work that day. The teas in this area are all zhengyan, so they’re expensive and worth picking by hand. This means a lot of people are needed, and since everything is done by foot and the distance can be quite long, you need a lot of people very quickly to get things done. There are also some tourists who are just walking through the path, taking in the scenery.

Tea trees are planted all along the path wherever there’s space. Until the 90s, there were a lot of villages inside this park area. People who planted (and still own) these tea trees were all villagers here. Back then, the primary processing areas – basically a shed and a few simple tools – were all very close to the tea planting area. Then the government moved all the villages out of the scenic area for protection of the area itself, because people pollute. So nowadays, the leaves have to be carried way out to get any processing done.

We took a path up a hill towards a plot of trees owned by a tea producer friend. When you walk into this narrow valley, it looks a bit like this

The valley is quite long and narrow, with steep hills on both sides. There’s only one way in and out. There’s a little stream of water flowing in the middle. It’s really ideal ground for tea plants – damp, cool, lots of shade, etc. It’s no wonder the tea here is good. Hiking up the hill to get here takes a bit of effort if you’re not in good shape, and I can’t imagine doing it with 100kg on my back. Many of the trees here are decades old. Prices, of course, match the quality.

One thing that surprised me is that they don’t really care if it’s raining – they’ll harvest anyway. If it’s very early in the season, and the rain is projected to stop very soon after, sometimes they’ll wait, but more often than not they’ll just harvest away – because there’s a lot of tea to harvest, and there just isn’t enough manpower and time to do it all at once. If you wait too long, the leaves get old and then they’re worthless. Since they only harvest once a year, that’s a lot of lost income if you let the leaves go. So, harvest they must, even if the processing is tougher and the quality perhaps less than optimal.

We entered the park around 1:30, and didn’t leave till close to 6pm when the sun was setting. All the workers were finishing up as well, waiting for trucks to pick them up to go back, only to come back early the next morning to start again. Even then, I saw a couple workers taking photos of the scenery around them – even though it’s back breaking work, the natural beauty here is impressive and perhaps makes the work just a little more pleasant.

The Shape of Code: C Standard meeting, April-May 2019

I was at the ISO C language committee meeting, WG14, in London this week (apart from the few hours on Friday morning, which was scheduled to be only slightly longer than my commute to the meeting would have been).

It has been three years since the committee last met in London (the meeting was planned for Germany, but there was a hosting issue, and Germany are hosting next year), and around 20 people attended, plus 2-5 people dialing in. Some regular attendees were not in the room because of schedule conflicts; nine of those present were in London three years ago, and I had met three of those present (this week) at WG14 meetings prior to the last London meeting. I had thought that Fred Tydeman was the longest serving member in the room, but talking to Fred I found out that I was involved a few years earlier than him (our convenor is also a long-time member); Fred has attended more meetings than me, since I stopped being a regular attender 10 years ago. Tom Plum, who dialed in, has been a member from the beginning, and Larry Jones, who dialed in, predates me. There are still original committee members active on the WG14 mailing list.

Having so many relatively new meeting attendees is a good thing, in that they are likely to be keen and willing to do things; it’s also a bad thing for exactly the same reason (i.e., if it’s not really broken, don’t fix it).

The bulk of committee time was spent discussing the proposals contained in papers that have been submitted (listed in the agenda). The C Standard is currently being revised; WG14 are working to produce C2X. If a person wants the next version of the C Standard to support particular functionality, then they have to submit a paper specifying the desired functionality; for any proposal to have any chance of success, the interested parties need to turn up at multiple meetings and argue for it.

There were three common patterns in the proposals discussed (none of these patterns are unique to the London meeting):

  • change existing wording, based on the idea that the change will stop compilers generating code that the person making the proposal considers to be undesirable behavior. Some proposals fitting this pattern were for niche uses, with alternative solutions available. If developers don’t have the funding needed to influence the behavior of open source compilers, submitting a proposal to WG14 offers a low cost route. Unless the proposal is a compelling use case, affecting lots of developers, WG14’s incentive is to not adopt the proposal (accepting too many proposals will only encourage trolls),
  • change/add wording to be compatible with C++. There are cost advantages, for vendors who have to support C and C++ products, to having the two languages be as mutually consistent as possible. Embedded systems are a major market for C, but this market is not nearly as large for C++ (because of the much larger overhead required to support C++). I pointed out that WG14 needs to be careful about alienating a significant user base by slavishly following C++; the C language needs to maintain a separate identity, for long-term survival,
  • add a new function to the C library, based on its existence in another standard. Why add new functions to the C library? In the case of math functions, it’s to increase the likelihood that the implementation will be correct (maths functions often have dark corners that are difficult to get right), and for string functions it’s the hope that compilers will do magic to turn a function call directly into inline code. The alternative argument is not to add any new functions, because the common cases are already covered, and everything else is niche usage.

At the 2016 London meeting Peter Sewell gave a presentation on the Cerberus group’s work on a formal definition of C; this work has resulted in various papers questioning the interpretation of wording in the standard, i.e., possible ambiguities or inconsistencies. At this meeting the submitted papers focused on pointer provenance, and I was expecting to hear about the fancy optimizations this work would enable (which would be a major selling point of any proposal). No such luck, the aim of the work was stated as clearly specifying the behavior (a worthwhile aim), with no major new optimizations being claimed (formal methods researchers often oversell their claims, Peter is at the opposite end of the spectrum and could do with an injection of some positive advertising). Clarifying behavior is a worthwhile aim, but not at the cost of major changes to existing wording. I have had plenty of experience of asking WG14 for clarification of existing (what I thought to be ambiguous) wording, only to be told that the existing wording was clear and not ambiguous (to those reviewing my proposed defect). I wonder how many of the wording ambiguities that the Cerberus group claim to have found would be accepted by WG14 as a defect that required a wording change?

Winner of the best pub quiz question: Does the C Standard require an implementation to be able to exactly represent floating-point zero? No, but it is now required in C2X. Do any existing conforming implementations not support an exact representation for floating-point zero? There are processors that use a logarithmic representation for floating-point, but I don’t know if any conforming implementation exists for such systems; all implementations I know of support an exact representation for floating-point zero. Logarithmic representation could handle zero using a special bit pattern, with CPU instructions doing the right thing when operating on this bit pattern, e.g., 0.0+X == X (I wonder how much code would break if the compiler mapped the literal 0.0 to the representable value nearest to zero).

Winner of the best good intentions corrupted by the real world: intmax_t, an integer type capable of representing any value of any signed integer type (i.e., a largest representable integer type). The concept of a unique largest has issues in a world that embraces diversity.

Today’s C development environment is very different from 25 years ago, let alone 40 years ago. The number of compilers in active use has decreased by almost two orders of magnitude, the number of commonly encountered distinct processors has shrunk, the number of very distinct operating systems has shrunk. While it is not a monoculture, things appear to be heading in that direction.

The relevance of WG14 decreases, as the number of independent C compilers, in widespread use, decreases.

What is the purpose of a C Standard in today’s world? If it were not already a standard, I don’t think a committee would be set up to standardize the language today.

Is the role of WG14 now to be the arbiter of useful common practice across widely used compilers, documenting those decisions in revisions of the C Standard?

Work on the Cobol Standard ran for almost 60 years; WG14 has to be active for another 20 years to equal this.

Disquiet: Sleepless in San Francisco

I Dream of Wires is the incorrect title for that great documentary on modular synthesizers.

It should be titled: Couldn’t Sleep Because I Read Manuals Too Late at Night and My Brain Wouldn’t Stop Patching.

. . .

In the meanwhile, here is where my system is currently at (or will be when two of those modules arrive from, respectively, Poland and down in Southern California). The blank space at the bottom is, indeed, blank. Likely other modules will go before it is filled. We’ll see.

Disquiet: RSS and Its Contents

For the past month, for unclear reasons, the RSS feed of Disquiet.com wasn’t functioning well — or at least, little to nothing showed up on Feedly.com for many weeks in a row. In any case, that issue seems to have been corrected or otherwise resolved, hence the test post (since deleted) you may have noticed yesterday if you use RSS. Here are some of the stories that popped up on Disquiet.com during the blackout:

  • I had the privilege of appearing on Darwin Grosse’s excellent podcast, Art + Music + Technology.
  • The gorgeous new album from Rotterdam-based musician Michel Banabila, titled Uprooted, features a short essay from me as its liner notes.
  • Graffiti in Sacramento, California, suggests a P2P underground.
  • Marcus Fischer had a beautiful installation at a Portland gallery involving speaker cones and seed pods.
  • Louise Rossiter (based in Leicester, U.K.) is sharing the music she’s making as she learns the coding language Supercolider.
  • The Japanese translation of my Aphex Twin book was spotted in Tokyo again.
OUR VALUED CUSTOMERS: While discussing AVENGERS: ENDGAME...

    Daniel Lemire's blog: Really fast bitset decoding for “average” densities

    Suppose I give you a word and you need to determine the location of the 1-bits. For example, given the word 0b100011001, you would like to get 0,3,4,8.

    You could check the value of each bit, but that would take too long. A better approach is to use the fact that modern processors have fast instructions to count the number of “trailing zeros” (on x64 processors, you have tzcnt). Given 0b100011001, this instruction would give you 0. Then, if you set this first bit to zero (getting 0b100011000), the trailing-zero instruction gives you 3, and so forth. Conveniently enough, many processors can set the least significant 1-bit to zero using a single instruction (blsr); you can implement the desired operation in most programming languages like C as a bitwise AND: word & (word - 1).

    Thus, the following loop should suffice and it is quite efficient…

      while (word != 0) {
        result[i] = trailingzeroes(word);
        word = word & (word - 1);
        i++;
      }
    

    How efficient is it exactly?

    To answer this question, we first need to better define the problem. If the words you are receiving have few 1-bits (say, less than one 1-bit per 64-bit word), then you have the sparse regime, and it becomes important to quickly detect zero inputs, for example. If half of your bits are set, you have the dense regime and it is best handled using vectorization and lookup tables.
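
    As an aside, here is a minimal sketch of the lookup-table idea mentioned for the dense case. This is my own illustration, not code from the post, and it leaves out the vectorization: for every possible byte value we precompute the positions of its set bits, then decode a 64-bit word one byte at a time.

      #include <stdint.h>
      #include <stddef.h>

      static uint8_t positions[256][8]; /* bit positions that are set in each byte value */
      static uint8_t counts[256];       /* how many bits are set in each byte value */

      /* Call once before decoding. */
      static void init_tables(void) {
        for (int b = 0; b < 256; b++) {
          uint8_t n = 0;
          for (int bit = 0; bit < 8; bit++) {
            if (b & (1 << bit)) positions[b][n++] = (uint8_t)bit;
          }
          counts[b] = n;
        }
      }

      /* Writes the indexes of the 1-bits of `word` into `result`, returns how many. */
      static size_t decode_dense(uint64_t word, uint32_t *result) {
        size_t i = 0;
        for (int byte = 0; byte < 8; byte++) {
          uint8_t b = (uint8_t)(word >> (8 * byte));
          for (uint8_t k = 0; k < counts[b]; k++) {
            result[i++] = (uint32_t)(8 * byte + positions[b][k]);
          }
        }
        return i;
      }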

    But what do you do when your input data is neither really sparse (that is, you almost never have zero inputs) nor really dense (that is, most of your bits are set to zero)? In such cases, the fact that the instructions in your loop are efficient does not help you as much as you’d like, because you have another problem: almost every word will result in at least one mispredicted branch. That is, your processor has a hard time predicting when the loop will stop. This prevents your processor from doing a good job retiring instructions.

    You can try to have fewer branches at the expense of more instructions:

      while (word != 0) {
        result[i] = trailingzeroes(word);
        word = word & (word - 1);
        result[i+1] = trailingzeroes(word);
        word = word & (word - 1);
        result[i+2] = trailingzeroes(word);
        word = word & (word - 1);
        result[i+3] = trailingzeroes(word);
        word = word & (word - 1);
        i+=4;
      }
    

    The downside of this approach is that you need an extra step to count how many 1-bits there are in your words. Thankfully, it is a cheap operation that can be resolved with a single instruction on x64 processors.
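
    For concreteness, here is a hedged sketch of how that extra counting step fits together with the unrolled loop. It is my own paraphrase rather than the post’s benchmark code, and it assumes a BMI-capable x64 target (compile with something like -mbmi), so that tzcnt is well defined even once the word has become zero and the last group of slots is only partly meaningful.

      #include <stdint.h>
      #include <stddef.h>
      #include <immintrin.h>  /* _tzcnt_u64, _blsr_u64 (BMI1) */

      /* Returns how many indexes are valid; `result` needs a little slack
       * (its length rounded up to a multiple of 4) because the unrolled
       * loop may write a few junk entries past the true count. */
      static size_t decode_word(uint64_t word, uint32_t *result) {
        size_t count = (size_t)__builtin_popcountll(word); /* the single-instruction "extra step" (popcnt) */
        size_t i = 0;
        while (word != 0) {
          result[i]     = (uint32_t)_tzcnt_u64(word);
          word          = _blsr_u64(word);  /* clear the least significant 1-bit */
          result[i + 1] = (uint32_t)_tzcnt_u64(word);
          word          = _blsr_u64(word);
          result[i + 2] = (uint32_t)_tzcnt_u64(word);
          word          = _blsr_u64(word);
          result[i + 3] = (uint32_t)_tzcnt_u64(word);
          word          = _blsr_u64(word);
          i += 4;
        }
        return count;
      }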

    This ‘unrolled’ approach can avoid more than half of the mispredicted branches, at the expense of a few fast instructions. It results in a substantial reduction in the number of CPU cycles elapsed (GNU GCC 8, Skylake processor):

                     cycles / 1-bit   instructions / 1-bit   branch misses / word
      conventional        5.5                 8.3                   0.93
      fast                3.8                 9.8                   0.40

    So we save about 1.7 cycles per 1-bit with the fast approach. Can the mispredicted branches explain this gain? There are about 6 bits set per input word, so the number of mispredicted branches per 1-bit is either 0.15 or 0.065. If you multiply these fractions by 15 cycles (on the assumption that each mispredicted branch costs 15 cycles), you get 2.25 cycles and 1 cycle, a difference of 1.25 cycles. So while my numbers do not quite match, it does seem credible that the mispredicted branches are an important factor.

    I offer my source code; it runs under Linux.

    We use this decoding approach in simdjson.

    How close are we to the optimal scenario? We are using one instruction per 1-bit to count the number of trailing zeros, one instruction to zero the least significant 1-bit, one instruction to advance a pointer where we write, and one store instruction. Let us say about 5 instructions. We are getting 9.8 instructions. So we probably cannot reduce the instruction count by more than a factor of two without using a different algorithmic approach.

    Still, I expect that further gains are possible, maybe you can go faster by a factor of two or so.

    Further reading: Parsing Gigabytes of JSON per Second and Bits to indexes in BMI2 and AVX-512.

    Credit: Joint work with Geoff Langdale. He has a blog.

    i like this art: Allora & Calzadilla

    Allora & Calzadilla

    Work from The Great Silence.

    I was fortunate enough to see The Great Silence tonight at the Contemporary Art Center as part of a film screening with the Mini Microcinema titled “Artist and Animal Pt. 1” that is running in conjunction with the soon-to-open exhibition Creatures curated by Steven Matijcio.

    If you are in Cincinnati, the Artist and Animal, Pt. 2 is screening on June 27th and presented by Valentine Umansky.

    “The Great Silence focuses on the world’s largest single aperture radio telescope, located in Esperanza (Hope), Puerto Rico, which transmits and captures radio waves to and from the farthest edges of the universe. The site of the Arecibo Observatory is also home to the last remaining wild population of critically endangered Puerto Rican Parrots, Amazona vittata. Allora & Calzadilla collaborated with science fiction author Ted Chiang on a subtitled script about the bird’s observations on humans’ search for life outside this planet, while using the concept of vocal learning—something that both parrots and humans, and few other species have in common—as a source of reflection upon acousmatic voices, ventriloquisms, and the vibrations that form the basis of speech and the universe itself.” – Impakt.nl

    Michael Geist: Does Canadian Privacy Law Matter if it Can’t be Enforced?

    It has long been an article of faith among privacy watchers that Canada features better privacy protection than the United States. While the U.S. relies on binding enforcement of privacy policies alongside limited sector-specific rules for children and video rentals, Canada’s private sector privacy law (PIPEDA or the Personal Information Protection and Electronic Documents Act), which applies broadly to all commercial activities, has received the European Union’s stamp of approval, and has a privacy commissioner charged with investigating complaints.

    Despite its strength on paper, my Globe and Mail op-ed notes the Canadian approach emphasizes rules over enforcement, which runs the risk of leaving the public woefully unprotected. PIPEDA establishes requirements to obtain consent for the collection, use and disclosure of personal information, but leaves the Privacy Commissioner of Canada with limited tools to actually enforce the law. In fact, the not-so-secret shortcoming of Canadian law is that the federal commissioner cannot order anyone to do much of anything. Instead, the office is limited to issuing non-binding findings and racing to the federal court if an organization refuses to comply with its recommendations.

    The weakness of Canadian law became evident last week when the federal and British Columbia privacy commissioners released the results of their investigation into Facebook arising from the Cambridge Analytica scandal. The report details serious privacy violations and includes several recommendations for reform, including new measures to ensure “valid and meaningful consent”, greater transparency for users, and oversight by a third-party monitor for five years.

    Facebook’s response? No thanks. The social media giant started by disputing whether the privacy commissioner even had jurisdiction over the matter. After a brief negotiation, the company simply refused to adopt the commissioners’ recommendations. As their report notes “Facebook disagreed with our findings and proposed alternative commitments, which reflected material amendments to our recommendations, in certain instances, altering the very nature of the recommendations themselves, undermining the objectives of our proposed remedies, or outright rejecting the proposed remedy.”

    The federal commissioner has indicated that he plans to take the case to the federal court, where he will be forced to start from scratch by presenting sufficient evidence that Facebook violated Canadian law. Even with a successful claim, the law provides little in the way of penalties, with a newly established maximum of $100,000 for certain violations. By contrast, Facebook said last week that it has set aside US$3 billion in anticipation of U.S. enforcement penalties that could hit US$5 billion.

    This is not the first time a major company has refused to comply with privacy commissioner recommendations. In 2015, Bell initially rejected findings associated with its relevant advertising program that would have required customers to opt-out of behavioural tracking. After an avalanche of negative publicity, it reversed its position.

    With companies seemingly free to reject privacy commissioner findings – Facebook earns more than enough revenue every sixty seconds to pay the maximum PIPEDA penalty – Canadians are left without effective privacy protection. Innovation, Science and Economic Development Minister Navdeep Bains touted changes to the law that added new penalties, but the reality is that Canadian law now badly lags behind other countries.

    The obvious solution starts with granting the Office of the Privacy Commissioner order making power and supplementing the law with penalties that would make companies think twice before ignoring PIPEDA.

    The Office of the Privacy Commissioner was admittedly slow to recognize that the effectiveness of the law depends upon serious enforcement. In 2006, Jennifer Stoddart, then the federal privacy commissioner, told a House of Commons committee that order making powers were not a priority. A year later, it took federal court ruling to push a reluctant commissioner’s office to investigate foreign entities collecting personal information from Canadians.

    Today, as global companies are on the verge of regarding Canadian privacy law as irrelevant and the European Union is increasingly likely to re-examine its decision to consider Canadian law “adequate” for the purposes of cross-border data transfers, the office has rightly become convinced that the law must be upgraded. That leaves the question of what more Mr. Bains and the government need to recognize that their vision of leadership in the digital economy is being undermined by privacy rules that leave millions of Canadians without effective and enforceable protection.

    The post Does Canadian Privacy Law Matter if it Can’t be Enforced? appeared first on Michael Geist.

    Tea Masters: Tea news

    Spring is a busy time for tea bloggers. The sun shines longer, but the articles become shorter!

    1. I have come back from Alishan with some great harvest pictures, and 2 spring teas: a Jinxuan Oolong and a Qingxin Oolong. The Qingxin was harvested on April 29th and is already available on May 2nd! For Wenshan Baozhongs and other high mountain Oolongs, it will take more time until they are available. Thanks for your patience.

    2. The Podcast I did with Ken Cohen of Talking Tea while I was in NYC has been released. You may listen to me discussing Chaxi: Harmony, Art & Expression in Tea.
    3. This may be a good time to remind my readers that if you like the content of this blog, the pictures I post on my Photo blog or Instagram, the best way to support my work is with a purchase on the tea-masters.com boutique. The teas I source in Taiwan are of great quality and my Chabu are made by hand exclusively for my boutique. And there are many benefits when you order from my boutique: free tea postcards, free samples (above 60 USD), free shipping (above 100 USD) and free eBooks about tea (starting at 60 USD orders)...
    I'm very grateful for the support I receive all around the world!

    Perlsphere: Fixing PAUSE permissions

    For the first 20 or so years, PAUSE treated indexing permissions case sensitively. So Foobar was considered a different module from foobar. This caused problems on operating systems with case insensitive filesystems like MacOS and Windows. So at previous QA Hackathons and Toolchain Summits, it was decided that PAUSE should treat indexing permissions case insensitively. In 2016, at the Rugby QA Hackathon, I started a project to resolve the historical clashes in indexing permissions.

    OUR VALUED CUSTOMERS: Piss jug blues...

    MattCha's Blog: 2016 Zheng Si Long Mang Zhi / I think I like Mang Zhi


    I quite like tasting the selection of Zheng Si Long Mang Zhi puerh at Tea Encounter.  So after receiving this sample in a complimentary care package from Curigane (Tiago) of Tea Encounter, I make this 2016 Zheng Si Long, at $150.85 for a 400g cake ($0.38/g), the first of the bunch to sample…

    Dry leaves smell of a very sweet, kind of faint floral fruit with a mineral rock like odour.

    First infusion has a mild sweet honey and woody almost grain onset with wood bark and almost cherry tastes in the aftertaste.  The tastes are crisp and pure the mouthfeel tight.

    The second infusion has a soft onset of dry wood, grains, almost grass, with a subtle fruity nuance.  A candy-like aftertaste lingers on the breath.  The mouthfeel opens gently and is slightly taut and sticky.

    The third infusion has a malty grains and woody sweetness.  There is a long sweet, date-like, malty sweetness and noticeable camphor pungency in the returning sweetness.  The aftertaste displays different layers of sweetness- malty, date, and faint candy floss.

    The fourth infusion starts off sweet and fruity with dry wood in the distance.  The pungent camphor aftertaste sets off a long sweetness the trails into the breath.  This sweetness is maltier and grain-like initially but has a candy-like nuance in the aftertaste.

    The fifth infusion starts off mainly sweet malty, grainy, almost date.  There is pungency then long sweetness.  The mouthfeel is slightly tight and slightly sticky but not overly stimulating.  The throat is mildly opened.  The profile overall is quite sweet with dry wood – very Yiwu in profile.  The Qi is mild and lingers heavy in the head, behind the eyes, and makes me feel pretty relaxed, almost dopy.  I’m looking for an energy boost this morning but am given only dopiness and relaxation and a heavy head.

    The sixth infusion starts off balanced between dry wood and malty, grainy sweetness.  The onset of camphor sets into motion a long sweetness.  Sticky mouthfeeling.

    The seventh infusion is a nice clean woody, malty sweet taste.  There is less sweetness now and more of a light balance between dry woods and malty sweetnesses.  The camphor returning taste is there and the sweet tastes are long, if not becoming less obvious now.  The long candy like breath taste is holding though.

    The eighth infusion has a more intense pungent camphor taste but less obvious initial taste.  The pungent camphor in this tea is very nice as is its long aftertaste.  The tastes are one dimensional but pure and flavourful.  The qi continues to relax and sedate.

    The ninth is more wood but still contains all the same flavor elements in there.  The mouthfeel is becoming more sandy and less sticky now.

    The tenth is loosening a bit of strength now; it’s mellowing out into a nice sweet/woody/camphor thing.  The eleventh is sweeter now, but it’s harder to define the sweetness.  The taste remains pure and unmuddled here but not as intense or nuanced as early on.

    I add 10 seconds to the flash infusion in the 12th and get a nice fresh, crispier woody sweetness.  In the 13th I add 20 seconds and get much the same mellow woody sweetness but the longer infusion pushes out more candy aftertaste that still lingers in these leaves.  I feel spacy and lazy.

    Too lazy to do more, I do a few more longer infusions with enjoyable fruity/woody tastes in a mild sandy/sticky mouth- and throat-feeling.

    This puerh has a very stable infusion-to-infusion session, and what is there is very pure and very flavourful.  Its single estate pedigree is quite obvious.  Very Yiwu tasting.  The mouthfeeling/throatfeeling and qi are not bad either.  A very solid tea for those who like the Yiwu taste and single estate but don’t want to pay crazy for it.

    I go to compare the 2018 and 2017 Mang Zhi but I drank them up…. Too delicious to keep around too long, I guess… I’m starting to really enjoy these Zheng Si Long Mang Zhi.  My favorite out of the bunch is the 2017.

    Peace

    CreativeApplications.Net: HALO – Science of nature through the eyes and ears of a technological sublime

    HALO is a large scale immersive artwork which embodies Semiconductor’s ongoing fascination with how we experience the materiality of nature through the lens of science and technology.

    Michael Geist: Myths and Reality About Canadian Copyright Law, Fair Dealing and Educational Copying

    Seeking to debunk many of the misleading claims on the state of Canadian copyright, fair dealing and education, I’m grateful that InfoJustice.org has published my post on the myths and realities of the current situation. The post relies on actual data presented at the recent copyright review to demonstrate how the Canadian market has experienced increased spending on licensing, e-book licensing has been a central part of the education licensing strategy, and educational institutions are paying for licences even when they retain collective licences.

    The introduction notes:

    Schools and universities are shifting to the use of digital resources – including to online E-reserves, E-Books and other forms of digital distribution. Collective (blanket) licensing, which for years has charged schools for making analogue reproductions of excerpts of printed works for use in printed course packs has declined in value and usefulness as education invests in digital licensing that offers enhanced access and reproduction rights. To facilitate the shift that benefits all stakeholders, legal rules must reflect emerging practices in which blanket licences compete in the market with alternative licensing models. One answer, represented by Canada, is a mix of broader copyright exceptions for the use of excerpts for educational purposes combined with a shift in educational spending toward buying and licensing more digital works and digital uses of works. The result is that educational spending on licensing in Canada has increased with exceptions and licences co-existing in a manner that provides appropriate compensation for authors and publishers alongside increased access and flexibility for educational uses.

    The full post can be found here.

    The post Myths and Reality About Canadian Copyright Law, Fair Dealing and Educational Copying appeared first on Michael Geist.

    MattCha's Blog: 2018 Black Friday Puerh Purchase Report


    I thought it would be a good idea to go over all my 2018 Black Friday purchases and free samples.  This year I put myself on a budget of $500.00 for Black Friday and did a good job of sticking to it.  I was also swayed by the sales offered by vendors and came out of the sale not exactly how I intended going in.  However, the puerh which I purchased was all very good and good value.  Many of them are now sold out.

    white2tea

    I went into Black Friday to purchase a cake I’ve read a lot about over the last few years, this 200g cake of 2016 white2tea We Go High.  I was also keen to test the waters with the bottom of white2tea’s brand in trying to snag some promotional cakes to compete in a search for the cheapest fresh young sheng… if that is how it was going to go down… and it did. In the end I ended up with some free samples and a free tote.


    Complex fruity layers, interesting range of notes, nice stoner qi, good stamina, not sure about value still but liked it enough to purchase another cake.  Nice raw material.  I enjoy both the stamina and Qi in this one.  Price went up just a nudge to $152.00 for 200g cake or $0.76/g.


    Decent mouthfeel that hold complex blend of interesting notes together nicely, good diversity of tastes throughout the session, good stamina, depth of taste is lacking, tea liquor is a bit thin, still quite a good  value at the price.  Sold out in a few hours.  Not sure if I would go for it next year, if offered.


    Smooth grainy sweet taste, wheat, mild coco, mild cognac taste, tight mouthfeeling pulling throat, wood.  There are better shu at white2tea.


    Fresh bamboo suggestions, toasted grains, deep milky coco shu taste with nice coolness in the breath, nice strong and alerting Qi.  Rich and velvety mouth- and throat-feel.  Nicest bamboo puerh I’ve tried.

    Overall some nice teas put out by Paul at white2tea… great job!

    The Essence of Tea

    I noticed that David and Yingxi re-stocked some Malaysian stored 2003 Hong Kong Henry Conscientious Prescription in the lead up to Black Friday.  I missed out on these when he offered them a year before and was hoping they would stay in stock long enough for any Black Friday promotions… and they did!

    I was going into Black Friday thinking that I would pick up 2 cakes but only ended up going with one of these and a mini cake to get me to the free shipping.  I was offered 20% free tea promotion which I spent on the 2010 Da Xue Shan Wild 1KG Brick and ended up landing a free sample as well, sweet deal!


    Clean/pure, more humid Malaysian storage; interesting sour fruit and pine wood taste, stimulating a mild gripping throatfeeling with solid menthol taste.  Slightly thinner liquor, with a mildly dizzying and somewhat heavy-headed Qi sensation; simple in some ways but delicious, easy-drinking aged puerh.  Not sure if I will pursue another cake.


    Nice fresh drink now spring Wuliang area puerh, very green tea-like production that is full of light refreshing puerh energy, easy to drink enjoyable value puerh.  Slight astringent if pushed, nice relaxing fluffy qi.  I’ve almost drank up the bing!


    Excellent wild tea.  Heart-pounding/spaced-out Qi.  Sweet fruity/juicy taste is quite strong and long, with the throat capturing the saliva minutes later into a long dense sweet aftertaste and breath taste.  One of the best wilds I’ve tried.


    Vibrant, intense, freshly aromatic, with a smooth complex fruity body, deep rich unfolding hong cha profile.  Very interesting hong cha from very good raw material.

    These Essence of Tea purchases are all above the mark; they are all very different and I really enjoy all of them.  When I purchase from Essence of Tea, this is what I usually expect… and they usually deliver.

    Yunnan Sourcing

    I wasn’t planning on purchasing from Yunnan Sourcing but Scott’s 15% off sale lured me in.  I ended up buying a stack of the now sold out 2017 Yunnan Sourcing Impression.


    Very nice blend of Autumn and Spring puerh.  Flavours early on are great – very fragrant, juicy tropical fruit, creamy sweetness, some other vegetal notes.  Qi has an alerting yet dopy effect and is decently strong.  Poor stamina, but overall a real winner for the price.  Will there ever be a blend this good and cheap again?

    Overall, very happy with all of these Black Friday buys.  They are a decent enough reflection of the puerh I enjoy with an emphasis on cheaper/value puerh.  Most of the teas I went for are sold out so the purchase was a timely one.

    Peace

     

    OUR VALUED CUSTOMERS: To his friend...

    CreativeApplications.Net: Exercises in visual audio and deviant electronics: CAN to host twelve workshops at Mapping Festival 2019 (May 23-26)

    Learn how to prototype post-screen interfaces, examine network infrastructures, hack museums, and transplant scents with leading artists, designers, and researchers at this year’s Mapping Festival.

    CreativeApplications.Net: Electronium – Raymond Scott’s instantaneous performance-composition machine reimagined by Yuri Suzuki

    Yuri Suzuki and Pentagram in collaboration with Counterpoint have re-imagined the Electronium - Raymond Scott’s instantaneous performance-composition machine.

    OCaml Weekly News: OCaml Weekly News, 30 Apr 2019

    1. FYI: Writing Network Drivers in High-Level Languages
    2. orewa 0.1.0, a redis binding (initial release)
    3. Unicode support improvements in zed, lambda-term and utop
    4. initial release of ppx_cstubs
    5. Dune, cross compilation, and Raspberry Pi
    6. Searching for functions
    7. Other OCaml News

    Michael Geist: The LawBytes Podcast, Episode 9: The CRTC Watcher – A Conversation with FRPC’s Monica Auer

    Many Canadians follow telecommunications and broadcast issues at the CRTC from a distance – the cost of wireless services, the speed of their Internet access, the availability of broadcasting choice. Others engage more closely on issues such as net neutrality, Cancon regulation, or Netflix taxes. But there is one Canadian who doesn’t just follow the CRTC. She watches it through the use of access to information laws that present a perspective on the CRTC that would otherwise remain hidden from view. Monica Auer, the Executive Director of the Forum for Research and Policy in Communications, joins the podcast this week to talk about insider access, slow reimbursement of costs for public interest groups, the number of CRTC meetings, and the Commission’s seeming indifference to commissioning original research. The interview is interspersed with comments from current CRTC chair Ian Scott taken from one of his first public speeches after being named chair in 2017.

    The podcast can be downloaded here and is embedded below. The transcript is posted at the bottom of this post or can be accessed here. Subscribe to the podcast via Apple Podcast, Google Play, Spotify or the RSS feed. Updates on the podcast on Twitter at @Lawbytespod.

    Episode Notes:

    FRPC Policy 3.0 Conference (registration)

    Credits:

    CPAC, CRTC Chair Ian Scott Speaks About Internet Neutrality
    FairPlay Canada Urges CRTC Action Against Online Theft

    Transcript:


    Michael Geist:
    This is LawBytes a podcast with Michael Geist.

    Ian Scott:
    This organization has a long tradition of being at the forefront of regulatory change. The many hundreds of men and women who have worked here since 1968 have adopted thoughtful creative, made in Canada approaches to deal with a vast array of complex regulatory challenges.

    Michael Geist:
    Many Canadians follow telecommunications and broadcast issues at the CRTC from a distance – the cost of wireless services, the speed of their Internet access, the availability of broadcasting choice. Others engage more closely on issues such as net neutrality, Cancon regulation or Netflix taxes. But there’s one Canadian who doesn’t just follow the CRTC. She watches it through the use of Access to Information laws that present an insider perspective on the commission that would otherwise remain hidden from view. Monica Auer is the executive director of the Forum for Research and Policy in Communications. She’s been an analyst at the CRTC, worked for what was then Industry Canada, obtained two law degrees, worked at a major law firm, and now heads up an organization that may not be widely known but has had a big impact on our understanding of what takes place behind the scenes at Canada’s telecom and broadcast regulator. This LawBytes podcast episode features a recent conversation with Monica about insider access, slow reimbursement of costs for public interest groups, the number of CRTC meetings and the commission’s seeming indifference to commissioning original research. The interview is interspersed with comments from current CRTC chair Ian Scott, taken from one of his first public speeches after being named chair in 2017.

    Michael Geist:
    Not a lot of people are necessarily familiar with the forum. Maybe you can take a moment to introduce yourself and the forum.

    Monica Auer:
    The forum is a non-profit non soliciting federally incorporated organization which was established in late 2013 to undertake primarily empirical research and policy analysis with respect to broadcasting and telecommunications. Although there are other organizations out there that have also done this a number of them are are focused on a little bit more on the law and we’re trying to bring in more of the empirical evidence.

    Michael Geist:
    So who’s behind the forum.

    Monica Auer:
    I have a board of directors who are wonderful; they are very experienced, primarily in broadcasting but also some telecom. We have somebody who was formerly with the auditor general’s office, so they’re always very interested in our budgets, and the board decides whether or not we should participate in specific proceedings, whether they’re before Parliament or the CRTC.

    Michael Geist:
    You’ve assumed this position as being the CRTC watcher. OK. You are the person, the forum is the place, that surely launches a significant number of the access to information requests, and it’s one of the few groups, and you’re one of the few people, I think, that have the depth of knowledge of both the CRTC and the field to be able to take a look at some of the results – to look at what’s actually coming out of the CRTC and be able to understand and interpret them. That’s, I think, a really rare thing, and it’s a pretty amazing contribution, because when you take a look at what’s on your site and the myriad of requests that you’ve launched over now many years, it tells some pretty interesting stories that don’t typically capture people’s attention, but I think they should, because they tell a really interesting story about the CRTC. So I was hoping to drill down a little bit on some of those. Starting, I think, with FairPlay.

    FairPlay:
    FairPlay Canada wants the CRTC to modernize the tools we use to protect Canada’s economy from online piracy we are proposing a tool similar to that used in dozens of other countries which empowers the CRTC to identify illegal piracy sites and disable them in Canada.

    Michael Geist:
    This was for those that aren’t familiar an issue that I was actively involved with as well Web site. A web web site blocking proposal led by coalition of groups. But it was particularly Bell it was was very active on it. You submitted on it I submitted on it. Thousands of Canadians actually submitted on it but the aspect that I thought we could focus on was an access to information request that you filed a little bit after the deadline or at least came back a little after the submission deadline.

    Monica Auer:
    I think it was a little bit after the deadline because we were all racing to meet the deadline and it was triggered by the fact that many people were reading the application and we were somewhat perplexed because we know many of the players involved many of the authors involved and they’re extremely competent they really know their business. And yet this application seemed a little different from the rest. It took an unusual approach to the entire issue which was a serious one. We actually did a survey of Canadians to find out what they thought about the idea of somebody like the CRTC being able to block their access and they were very concerned the majority the overwhelming majority was concerned.

    Ian Scott:
    As companies continue to innovate in their offerings to Canadians the CRTC will continue to ensure that Canada’s Internet neutrality provisions are respected. Just as the Transport Commission before us get railway lines open for any and all users. The CRTC has set a clear tone in its regulation of content delivery. The owners and operators of the country’s communication systems may not discriminate against content based on its origin or its destination.

    Monica Auer:
    So the more I thought about it the more I wondered about how the application came about and I wondered if there had been any chance that there might have been some discussion at the commission before the application itself was filed on January 29, 2018. And that’s why I filed the the Access to Information request.

    Michael Geist:
    And what did you find?

    Monica Auer:
    I got back two sets of information, two packages, setting out a number of emails and presentations and exchanges that appear to involve CRTC and non-CRTC personnel, and I looked at it and I thought, my goodness, this is unusual. I had not expected to see that in fact there had been meetings before the application was filed. I was unprepared for that. I was also unprepared for what appeared to be fairly extensive ex parte communications, if you will. Now it’s true, ex parte normally refers to a process that’s already launched, where the parties then discuss with the adjudicator outside of the other parties’ range. But in this case the CRTC is always in a position of judging things, so in that sense it might be considered ex parte. In any event, I looked at that and I thought, this is an interesting set of documents and I actually didn’t have time to look at it. And that’s why I sent everything to you. I wasn’t quite sure what could be done with it, and the fact of the matter is that although we do have some people interested in our website, and it does have some fun things on it, I wanted to make sure that people would have as much access as possible to this, because it speaks to the nature of informal contact between administrative authorities, society and those they regulate.

    Ian Scott:
    When any regulator, the CRTC included, performs its work in the name of the public interest, it must balance various points of view, some of which obviously conflict. It’s not always appropriate to lean on one side of the fence or another, in the interests of corporations or in defence of what the average Canadian needs or wants; the public interest is much more dynamic than that.

    Michael Geist:
    The CRTC’s reaction, and Bell’s reaction for that matter, was that there was nothing to see here. What are your thoughts?

    Monica Auer:
    I think there was something to see, because if there were nothing to see there wouldn’t have been any documents. I think there was something to see because being told that there is nothing to see is usually a pretty good clue that there actually is something to see. And the fact of the matter is that everybody who read the documents knows there was something to see. So for all three reasons, I can appreciate the perspective of those who didn’t want to have those things seen, but that’s not the case.

    Michael Geist:
    And do you think there’s something wrong with Bell being able to call up a CRTC commissioner seeking the opportunity to present this – as I ask this question I’ve got a big smile on my face because it feels a little rhetorical – but you know, your thoughts on the power that companies have when they can, quite literally in this case, call up a CRTC commissioner, ask for the ability to present something, and then ultimately have the ability to meet with staff months before anything ever gets filed.

    Monica Auer:
    I guess the question is, would anybody mind, as long as the CRTC were equally open to all parties? Except that I’m aware of a number of non-licensee and non-telecom companies that call and can’t get telephone calls returned from CRTC commissioners. So it seems to me when that’s a repeated pattern of behaviour there may not be so much of a level playing field. Secondly, there’s no record of that; as a general rule there’s certainly no public record of telephone conversations. And of course we can go through the lobbying commissioner’s registry, and I think at one point I either have a blog in progress or planned, or a great deal of material on this. The fact of the matter is that those who are beholden to the CRTC as licensees, or those who are regulated, meet regularly, they meet weekly, with the commissioners.

    Monica Auer:
    There are no records of those meetings, in the sense that, you know, they’re not recorded. Do we want them recorded? Do we believe that it’s better for those who are making these decisions to be informed behind closed doors? Are we concerned that there might be an unusual quid pro quo, just do this thing here and we’ll take care of that thing there? We don’t know, and I think it places the commission in a very awkward position.

    Michael Geist:
    I think you’re right that that lack of balance, or the uneven playing field I suppose, may be part of the reason that we’ve got a cost awards system in Canada that’s designed to better facilitate the ability of civil society groups and other groups that are concerned with these issues to participate.

    Ian Scott:
    We invite, indeed encourage, stakeholders from across industry, the government and the public at large to share their comments and opinions on the issues before us, and we do so with a view to building as complete a public record as possible.

    Michael Geist:
    Can you start, I guess, by describing a little bit what that cost award system looks like, and then I want to get a little bit into what some of your research has shown in recent years about how long it takes for certain groups to be paid those cost awards.

    Monica Auer:
    I think the cost awards process was initially, effectively, launched by the Public Interest Advocacy Centre. They participated in one of the first major telecommunications proceedings in the late 1970s. And for those of your listeners, or those who are in the audience, who may not know, telecommunications was not always federally regulated. Until the mid 1970s it was regulated in part by the provinces. And so it was only in the late 1970s that the CRTC itself had jurisdiction and then began to exercise control over it.

    Monica Auer:
    And so the question arose: how do you ensure that you don’t just have the parties, and then a number of interested members of the public who lack the ability to participate effectively? The issue is not that everybody can’t write the CRTC a letter. The issue is that you have a company like Bell, which may have two or three hundred people working in its regulatory affairs department. To put that in context, the CRTC in its entirety has maybe 400 people, not half of whom are devoted just to telecom. So what to do?

    Monica Auer:
    Even if you just assume that the CRTC could address the public interest on behalf of the public, there is no mandate within the CRTC Act, within the Telecommunications Act, nor within the Broadcasting Act for the CRTC to put the public interest first. Secondly, in a way the commission would be conflicted out. It would be like telling a judge, well, keep in mind what that person there, the defendant, has to say, and you look after the defendant’s interest, and now we want you to rule fairly. The other side will always have a concern.

    Monica Auer:
    So there was a cost process set up in the late 1970s, in 1979, through a draft rules of procedure noticed by the CRTC. And of course the very first cost award was taken to court and went all the way to the Supreme Court. Fortunately, the Supreme Court recognized the benefit of having non-company participants involved. So this proceeded for many years in an unusual kind of way. The CRTC commissioners would appoint a member of their staff to be a costs officer, who would then adjudicate costs claims. That became very time consuming, and because it started to take so much time they decided to streamline the process and make it very fast, so you could just apply and the commission staff would go over the application to make sure that it was appropriate.

    Monica Auer:
    So that’s where the telecommunications process is right now. Applicants who appear and who have met the terms of the CRTC’s cost policies apply to have some of their costs, not all of them, reimbursed. The broadcasting side is somewhat different. The CRTC set up a parallel organization, but you don’t apply to the CRTC, you apply to another party. The issue there is that the CRTC allocated money to this agency and it’s out of money. So in terms of broadcasting, by next year perhaps there may be no money for public interest groups to participate.

    Michael Geist:
    The historical background is really valuable. It’s a system that at least on paper seems like a great idea, and good to see the Supreme Court thought so too. You’ve been looking into the speed with which these cost awards get paid out, because of course what happens, based on what you’ve just described, is that the public interest group will incur various costs of participation, which could include retaining experts to provide further evidence to the CRTC, and after it’s done, as you mentioned, you submit your bill and the CRTC reviews it. What did your research find about how long it’s taking, in a sense, to get paid back for those expenses?

    Monica Auer:
    The average is nine months, and what an average means is that there are some cost claims that are taking two years. Some are taking three years; not many, but some. To put that into context, I looked at the cost awards from 2013 through to 2018, so that was 182 of them, just to look at the time of filing, for instance, and then the actual date of the decision. In 2013, for the entire year, it took a little over three and a half months for the CRTC to receive the application, think about it and issue a decision. Now it takes nine and a half months.

    Michael Geist:
    Okay, so it was pretty quick just a few years ago.

    Monica Auer:
    Yes.

    Michael Geist:
    We’re talking nearly a tripling, based on what you just said, on average. Do we have any idea why? Has the CRTC said anything about why it’s taking so much longer to process these? Because now we’re talking about organizations that can wait a year or two, as you say, perhaps even three years, to get paid back for money they’ve spent, and many of these organizations are not deep-pocketed organizations that can, in a sense, afford the float of having all these accounts receivable outstanding.

    Monica Auer:
    Yes. And just to address that before I address the other point, some people have suggested, well, why didn’t you just save money so that from one year to the next you can keep going. But the fact is that if I retain an economics expert, the expert needs to be paid; I can’t take that person’s money and then hold it for the next one. It’s not a Ponzi scheme; if you have a bill, you have to pay it.

    Monica Auer:
    So, to think about why it’s taking so long. I have noticed myself, with respect to the Forum’s applications, and bearing in mind I think we made our first application in 2014, that the first one sailed through, sailed through in the sense that nobody objected, because the Commission has adopted a practice where they take every application that we make, or that any party makes in fact, and then they send it to the telcos, and the telcos have the right to reply. And I find that certainly the depth and length of the replies has gotten longer, and the arguments made are not changing. In other words, every time it seems we hear: well, you’re not a lawyer. Actually, yes I am. You’re not doing legal research. Actually, yes I am. You didn’t incur those costs. Well, the invoices are there, you know. Yes we did.

    Monica Auer:
    And then: well, you didn’t contribute anything to the process. Well, that’s up to the CRTC, and in fact this is what we did. We did a survey, we did economic research, we did charts, we did graphs, we did 150 pages with, you know, four gazillion footnotes. Actually we did; we think we contributed. So it’s up to the CRTC. And it would be interesting as well if you remember that the costs that are paid haven’t changed since 2010. In other words, the rates that we may charge have not increased in the last nine years, and that’s not an issue except when people might think, well, you’re just going to pad your bill to make up for it. Well, we can’t, because of, you know, that darn ethical code to which you swear when you become a lawyer. They require you to follow the rules. You can’t just pad; you live with it, and that’s fine. It’s just that then you have to carry the bills. And it’s not even, you know, okay, so I own my own house, I don’t have to pay rent, that’s terrific. But you know, other people do, and other people have office rents. And when we hire a survey company, do they really want to wait a year to get paid? So how do you ensure that you can have a strong public presence on a level playing field when we don’t have the resources, and ultimately we can’t get paid? I think the commission used to be able to do it quickly. It’s unclear to a number of us why it’s still taking so long.

    Michael Geist:
    So that’s unclear. One of the other things that you’ve been focusing on is how frequently the commission meets. I don’t know if there’s a connection or not, but that piece of research, which is quite recent, is extensive in taking a look at just the number of meetings that the CRTC is having. Maybe you can just start by introducing what it is that you asked for through the access to information system, and what did you get back?

    Ian Scott:
    The job before us as commissioners is to weigh what at times can be contrasting ideas. On the one hand, business has its own interests to present and defend; it must answer to its shareholders and maximize returns on investment, which is entirely appropriate. It’s not business’s duty to always promote the public interest.

    Monica Auer:
    I thought that when I asked how often the CRTC is meeting, I would get back maybe an Excel spreadsheet with just the dates of the meetings. That’s what I thought I would get, and then I would just play with that and say, oh look, you know, they’re meeting more or less the same. And the reason I asked was because I was wondering, could it possibly be that they’re not meeting as much these days, and that might be slowing things up. And I should preface this by saying that when I worked at the commission, the full commission was regularly in the building; every month it seemed to me, or every second month, there would be a meeting of the entire commission, which at that point was not just nine, I think it was 13 full commissioners, but then another up to 15 part-time commissioners, a lot of people. All right. So I asked the commission for a list of the meetings, the dates on which the commissioners met, and the agendas of the meetings, and I thought, you know, one-page agendas. Right.

    Monica Auer:
    So I received a scanned PDF of each month of the year from January 2007 to December 2018, and on many of the days there was a little notation as to what the meeting might have been about. Sometimes it was telling you that there would be no power in the CRTC’s buildings over the weekend, which is a regular occurrence. And I found, out of the hundred and fifty-one pages of calendars, once all of the data was input, I was looking at 3,000 or so meetings from 2007 to 2018, and the most interesting thing, I was surprised, was that the total number of meetings of the CRTC, whether the full commission, its committees, the telecom committee, any of those panels, had decreased by a third over the period.

    Michael Geist:
    Okay, so at the very time when these issues are becoming more front and center, when it feels like there’s almost a continuous cycle of hearings and issues and re-examination of telecommunications-related issues, wireless issues, broadcast issues, the commission is meeting less as opposed to more.

    Monica Auer:
    It is meeting less, and it is often meeting without any agendas. Roughly a quarter of the meetings happen without an agenda, and a cynic might say, well, there’s no agenda because then you can’t ask for the agenda under access to information; you can’t get what isn’t there. But I think the more interesting question is, what do commissioners do if they’re on this mailing list and they’re invited to attend a meeting but they don’t have a copy of an agenda? Do they have any documents to go with it? Is the commission not thinking that a document that sets out a problem is part of the agenda? I don’t know what it is; I’ve asked. The agendas didn’t accompany the calendar pages, so that is going to take somewhat more time and I’ll see what we get then. As for the meetings themselves, I think the other thing is that when we think of a meeting, I was very broad and apparently the commission was too, because what they also sent me showed that roughly two thirds of these meetings are done by email. And when you look at what counts as a meeting, I don’t know.

    Michael Geist:
    An email exchange is a meeting?

    Monica Auer:
    I don’t know. It would say, for instance, the notation might be BCMEM, broadcast committee meeting, email, and this was on their list, their calendar page showing all of these meetings. But even counting those, the in-person meetings decreased in number and the email meetings decreased in number over time. That’s it.

    Michael Geist:
    It’s counterintuitive to what one might think. The other thing that’s somewhat counterintuitive when you take a look at the work you’ve been doing has to do with the number of requests where, once again, there is no information. So you mentioned there were no agendas; in this instance you’ve asked for some really interesting things that one would have thought the commission might have spent some time digging into, consistent with the approach the Forum has about where’s the evidence, and trying to bring forward the evidence.

    Ian Scott:
    When this commission makes its decisions, it does so based only on the facts it has at its disposal. We depend on the public record to inform the choices we make. And it is when that record is fully developed and rich with information that the decisions we take are strongest and most supportive of the public interest.

    Michael Geist:
    I was taking a look at the list. I know that you’ve asked about things like how many journalist jobs exist in the country, the impact of foreign investment rules with respect to broadcast, the number of news bureaus in radio stations and television stations, studies on any number of different issues that the CRTC has been engaged in: diversity of voices, balance in news, deregulation in advertising. And the common link between all of these issues and requests is that the CRTC said they had absolutely nothing.

    Monica Auer:
    No records.

    Michael Geist:
    So just for those that don’t regularly file access to information requests, a “no records” response means that the CRTC says…

    Monica Auer:
    They have no information. And when I make an access to information request, I try for everything. I thought it was unfair to the commission to just say, hey, I’d like any records about this. So I thought I would clarify it: well, perhaps you might have written a memo or an email, or you might have had a research study or a briefing or a presentation. I tried to come up with as many synonyms as I could, and those would count as records under Access to Information case law. And I think they’re all surprising in that, you know, the very first section of the Broadcasting Act, Parliament’s very first statement in its broadcasting policy for Canada, is that the Canadian broadcasting system shall be effectively owned and controlled by Canadians. And yet when I asked, do you have any information on voting shares, they say, well, we don’t collect it like that; if you want to go to each of the ownership charts that we maintain on our website, you can see the non-Canadian voting shares there. But those charts are only current, they’re not historical. So if I want to see if there’s any kind of change over time in the level of foreign ownership, I can’t. Nor does it easily represent equity, which can be a form of debt; it can be a loan, we don’t know what it is, but it’s not a voting share, it can be a non-voting share. No information there as well. And overall, I think a number of people, like Dr. Winseck for instance at Carleton, have pointed out that Canada has one of the highest levels of concentrated ownership in the communications sector in the world. So it was puzzling to me that the commission has no research on this, hasn’t undertaken any studies, hasn’t commissioned any studies, has no information on the impact of concentrated media ownership, even in broadcasting. It’s bizarre to me.

    Michael Geist:
    So what do you think that says about the CRTC? I know, for example, that the broadcasting and telecommunications legislative review panel, which is ongoing and for which we should get at least an interim report by the summer, has itself commissioned a series of different research reports; that was the subject of someone else’s access to information requests. So it recognized, as it embarked on its review, that generating more evidence and conducting some studies was a natural byproduct of being actively involved in a comprehensive study, and the CRTC, it would appear based on the requests you’ve made, very often seems content to just rely on what it hears from people who submit as part of hearings. Is that what we’re led to conclude?

    Monica Auer:
    I think it’s not just a conclusion, it’s the fact. You’ll recall that we had the basic service obligation proceeding about four or five years ago. I was invited, along with many other people, to the lockup, and if we wanted to ask questions of the senior staff we could. So I said, you know, you’ve said X here, referring to evidence; could you tell me what research the commission itself undertook or bought about the telecommunications sector for this enormously important review? They said, well, we didn’t do anything. We had the evidence before us.

    Monica Auer:
    Well, who can afford to produce the wonderful types of evidence that they need? A good economic study, because economists are valuable people and their time is valuable, could cost you thirty thousand dollars, forty thousand dollars. To put it in context, a good survey research study, if you can’t analyze survey research results yourself (and I am able to, because of that otherwise not necessarily very useful undergraduate and graduate degree in political science, although I really liked my alma mater, Carleton, it was an excellent university), might cost, with the survey included, forty thousand dollars. Who has that money on the public interest side? And if the commission isn’t undertaking it, are they actually relying on public interest groups to come up with these data? And if they are, then again the cost order process is not assisting in ensuring that we can do this. Should the commission be doing more? I think yes. The commission not only should, but of course it could. It has enormously well qualified people. Its rules of procedure specifically state: if you’re going to intervene in our proceedings, please give us evidence.

    Monica Auer:
    All right, then I think the commission has a duty itself: if it wants to meet its own mandate, if it wants to ensure that it is actually acting properly as Parliament’s delegate, I think it has a duty to explain how that’s happening.

    Michael Geist:
    So there is a real stream of continuity with many of these requests, and the findings really identify a lack of evidence that at a certain level relies upon public interest groups to generate at least a counterbalance to what the commission might hear from submissions by some of the established players, and yet there are challenges to getting paid back for some of that. There are questions about how much the commission is actually meeting on any of these issues in any event, and when they do meet, it turns out some of the meetings that they have are these off-the-record meetings, such as the one that we saw with FairPlay, which raised a lot of concerns. I assume that some of these are the kinds of issues that may be addressed in a forthcoming conference that you’re putting on that examines policy and communications law in Canada. Can you tell us a bit about that?

    Monica Auer:
    Thank you for allowing me to make my little pitch. We’re holding a conference at the University of Ottawa itself, here, on May 10th and May 11th. The focus of the conference is the proposals that have been made to amend Canada’s broadcasting and telecommunications statutes.

    Monica Auer:
    As you know, not many of the submissions that were made to the legislative review panel on January 11th have been made public. A number of people have kindly mentioned them to us or given us consent to post their interventions, and so we have I think 57 at this point on our website. As I was beginning to look at those, and as I was discussing with my board, it became clearer that more and more people were interested in having at least an opportunity to discuss what these proposals are, what the implications are, and what the new statutes, if there are any, ought to look like. So we’ve invited a number of interesting people who are going to come and deal with a number of different aspects of broadcasting and telecommunications, in the sense of rights and responsibilities. There are a lot of rights in both the Telecom Act and the Broadcasting Act, and there are also responsibilities, both from the corporate perspective and from the audience perspective. It’s not enough just to complain about things; you have to be able to prove things. And so we’re looking at things like, of course, whether or not we have a duty to fund and to properly operate a national public content provider, which today we call the CBC; whether we actually have a right to exercise control over our own communication systems within our borders; given the Internet, what do we do? Are we going to exercise jurisdiction over Google, Facebook, YouTube, what do we do with that? Or is the era of national control over our borders effectively gone?

    Monica Auer:
    Do we have the right, in an era of, not so much fake news, but news that can be manipulated, do we have the right to ensure that people can be informed so that they can exercise their democratic franchise? And so we have lots of fun people: for instance, the former chair of the commission, Konrad von Finckenstein, has kindly agreed to come; we have Bram Abramson, formerly of TekSavvy; Janet Lo, currently with TekSavvy; we have Brad Danks from OutTV, who is an amazing thinker for a lawyer. He’s actually taken a long-term perspective as to what the future of the system might be, and I think his company is very much engaged in working with the new realities they face. We have Tim Denton and Suzanne Lamarre, both former CRTC commissioners, and John Lawford from PIAC. We would have invited you, except that you won’t be in Canada at that time. And so in your stead we’ve invited Dr. Dwayne Winseck, because to some extent he has a similar viewpoint, in that he is not necessarily as bound to the Broadcasting Act as I am. And so we’re hoping that he will stimulate conversation with people from ACTRA, DTGC. It should be a lot of fun.

    Michael Geist:
    Well thanks so much for joining us on the podcast.

    Monica Auer:
    Thank you very much for inviting me.

    Michael Geist:
    Policy 3.0, the FRPC’s upcoming communications conference, takes place at the University of Ottawa on May 10th and 11th. There is a link to the conference registration page at michaelgeist.ca.

    Michael Geist:
    That’s the LawBytes podcast for this week. If you have comments, suggestions or other feedback, write to lawbytes@pobox.com; that’s lawbytes at p-o-box dot com. Follow the podcast on Twitter at @lawbytespod or Michael Geist at @mgeist. You can download the latest episodes from my website at michaelgeist.ca or subscribe via RSS, at Apple Podcasts, Google, or Spotify. The LawBytes Podcast is produced by Gerardo LeBron Laboy. Music by the Laboy brothers: Gerardo and Jose LeBron Laboy. Credit information for the clips featured in this podcast can be found in the show notes for this episode at michaelgeist.ca. I’m Michael Geist. Thanks for listening and see you next time.

    The post The LawBytes Podcast, Episode 9: The CRTC Watcher – A Conversation with FRPC’s Monica Auer appeared first on Michael Geist.

    Planet Lisp: Lispers.de: Lisp-Meetup in Hamburg on Monday, 6th May 2019

    We meet at Ristorante Opera, Dammtorstraße 7, Hamburg, starting around 19:00 CET on 6th May 2019.

    Christian was at ELS and will report, and we will talk about our attempts at a little informal language benchmark.

    This is an informal gathering of Lispers of all experience levels.

    Update: the fine folks from stk-hamburg.de will be there and talk about their Lisp-based work!

    i like this art: Lothar Hempel

    Lothar Hempel

    Work from Working Girl.

    “Tess McGill is an Irish-American working-class stockbroker’s secretary from Staten Island with a bachelor’s degree in Business from evening classes. She aspires to reach an executive position. Tricked by her boss into a date with his lascivious colleague, she gets into trouble by publicly insulting him and is reassigned as secretary to a new financial executive, Katharine Parker.

    Emma Stern: “The works you are exhibiting this November in Dusseldorf are full of pictures of women that are known from movies. How should one behave opposite these faces and figures?”

    Lothar Hempel: “I think movies are about setting bodies in motion in unusual ways. In my sculptures and paintings, the bodies are freed from their context and stopped in their movement. They stand there without their original motivation. Commercial interest in them is extinguished, desire is silenced. You could say they are exposed, naked and empty. They are in a mysterious condition which I like to call nature. I can look at them, like a landscape. Then they belong to me, becoming inner figures. That way I can reinstall them, as a possible projection plane and the starting point of a new story, a new orbit.”

    Seemingly supportive, Katharine encourages Tess to share ideas. Tess suggests that a client, Trask Industries, should invest in radio to gain a foothold in media. Katharine listens to the idea and says she’ll pass it through some people. Later, she says the idea wasn’t well received.

    Emma Stern: “I don’t know if I can see that in the same way. For me, these figures are not freed, but cut or torn. You tend to emphasize the outline which in your work is always sharp, almost threatening. Lately, body elements that are usually on the inside are cut out, a lung or a heart, for example. At other times, words and text fragments overlap and dissect the bodies. What is happening there?”

    Lothar Hempel: “The outline for me is actually an almost obsessive moment, to become blatantly clear, I cannot get enough of it. The outline is the threshold at which an image ends and the so-called reality begins – in my sculptures this becomes particularly apparent. Of course there is this insane desire that the image never ends and instead reality is occupied or eradicated. Then, the outline is a circle or a tilted eight, the infinity sign. But this would be the end, standstill and death. So I have to keep working against myself. Things have to stay in motion, only then sense arises.

    Lately, a kind of x-ray vision into the figures developed increasingly. I want to not only retrieve the inner workings (Interiors), but also make audible the thoughts. The interior and the exterior must be made to speak simultaneously, so that images come to life. Because the individual work that I create may be a manifestation, but eventually it is also just a stopover on a larger cycle that moves further and that I want to be part of. I can give the pictures a direction, but at a certain point they develop a life of their own and you have to let them go.”

    But when Katharine breaks her leg skiing in Europe, she asks Tess to house-sit. While at Katharine’s place, Tess discovers some meeting notes where Katharine plans to pass off the merger idea as her own. At home, Tess finds her boyfriend in bed with another woman. Disillusioned, she returns to Katharine’s apartment and begins her transformation.

    Emma Stern: “Why the title Working Girl? Is that you?”

    Lothar Hempel: “So many things resonate with this title – a romantic notion of working class liberation and emancipation. I like the sentimental and optimistic tone it has. Unfortunately I don’t think I am that myself, I feel too alienated for that, but perhaps it expresses a kind of longing, a desire to belong. The movie of the same title is about transformation and about what really matters to people and that is what has always interested me. I believe that we have no nature as such, but develop in and through art – so we’re really the product of ourselves. Here, too, “Working Girl” fits – it transports this idea of tireless and vigorous pursuit, even if the outcome is completely unclear. Isn’t that what’s beautiful about it?”” – Sies + Höke

    : Wuyishan

    Last time I was here in Wuyishan was about twenty years ago. I had only just started drinking tea more seriously back then, and it was also a much calmer place than now. This time, I know a little more about tea, and the place is a lot busier – it’s like a giant tea mall now, with almost every single shop in town selling yancha of some kind.

    On some level, most tea producing regions in East Asia are the same. They mostly work on a small farmer model, where each individual farmer has a smallish holding of land with a limited production capacity. Calling them “plantations” would, for the most part, be a misnomer. These are family held farms, and most people here produce tea for their own sales.

    This is, however, also a big tourist attraction because of its nature, so there are a fair number of stores in town that have nothing to do with any farm – their trade is the tourist one, and they’re here to make some money selling people tea. Well, everyone’s here to make money selling tea, but in these cases they’re traders. Nothing wrong with that, although, as per the Longjing rule, one ought to remember that just because someone’s in the production area, it doesn’t mean they’re here making tea or have direct access to teas.

    I’m here to do research for my project. Thankfully, I have some contacts here through friends, and I’ve been to some tea farms and factories. Like I said, some things are about the same no matter what area you’re in. The Wuyishan area though has some unique properties.

    The thing is, Wuyi tea is divided roughly into three areas – zhengyan (proper rock, literally), banyan (half rock), and zhoucha (island tea). The names don’t make a lot of sense in translation. Zhengyan means teas in the proper Wuyi areas – inside the valleys and hills of the Wuyi mountains national park area. Banyan are the stuff right around the area – like the photo above, basically on the outskirts of the national park. Then you have zhoucha, which is now a term generally meaning stuff all around even further away. Zhoucha is regional stuff made in the style of Wuyi teas, whereas zhengyan stuff is the “real deal”. Prices, of course, are according to these regions. Run of the mill zhengyan stuff will easily run over a couple thousand RMB for 500g, more of course if you’re buying from someone who sourced it and is selling to the Western market. Better stuff would be multiples of that.

    Then there are these so called “special” areas – Niulan keng, Huiyuan keng, etc. There’s a lot of hype here, typical of the Chinese tea market these days. Prices can be sky high for some of these teas – like 20k USD a jin (500g) for some really special ones, even though these are sort of one of a kind trades. The more “regular” stuff can still run above ten thousand RMB a jin. It’s frankly pretty silly. Can the difference really be that great? Yes, I suppose, but in general, once you go above a certain price point… the incremental improvement in your experience is going to be marginal.

    Then of course there are the fakes – cheaper stuff faking to be better ones, usually. Nobody would believe you if you sell a non Wuyi tea as a top grade one, but a banyan being sold as zhengyan? If you’ve never had a bunch of both, you probably can’t tell the difference. Banyan is still good tea, but you’ll pretty much never see people advertise it as that, unfortunately. As always, tea is pretty anonymous, and it’s quite easy to try to dress up a tea as something else.

    It’s raining cats and dogs outside, and it’s 2:40am. Time for bed. Hopefully the rain stops so I can go see some more teas tomorrow, deeper into the park.

    OCaml Planet: Blockchains @ OCamlPro: an Overview

    OCamlPro started working on blockchains in 2014, when Arthur Breitman came to us with an initial idea to develop the Tezos ledger. The idea was very challenging with a lot of innovations. So, we collaborated with him to write a specification, and to turn the specification into OCaml code. Since then, we continually improved our skills in this domain, trained more engineers, introduced the technology to students and to professionals, advised a dozen projects, developed tools and libraries, made some improvements and extensions to the official Tezos node, and conducted several private deployments of the Tezos ledger.

    For an overview of OCamlPro’s blockchain activities see http://blockchains.ocamlpro.com

    TzScan: A complete Block Explorer for Tezos

    TzScan is considered today to be the best block explorer for Tezos. It’s made of three main components:

    • an indexer that queries the Tezos node and fills a relational database,
    • an API server that queries the database to retrieve various information,
    • a web based user interface (a Javascript application)

    We deployed the indexer and API to freely provide the community with access to all the content of the Tezos blockchain, already used by many websites, wallets and apps. In addition, we directly use this API within our TzScan.io instance. Our deployment spans multiple Tezos nodes, multiple API servers and a distributed database, to scale and reply to millions of queries per day. We also regularly release open source versions under the GPL license, which can be easily deployed on private Tezos networks. TzScan’s development was initiated in September 2017. It represents an enormous investment today, which the Tezos Foundation helped partially fund in July 2018.

    Contact us for support, advanced features, advertisement, or if you need a private deployment of the TzScan infrastructure.

    Liquidity: a Smart Contract Language for Tezos

    Liquidity is the first high-level language for Tezos over Michelson. Its development began in April 2017, a few months before the Tezos fundraising in July 2017. It is today the most advanced language for Tezos: it offers OCaml-like and ReasonML-like syntaxes for writing smart contracts, compilation and de-compilation to/from Michelson, multiple entry points, static type-checking à la ML, etc. Its online editor lets you develop smart contracts and deploy them directly onto the alphanet or mainnet. Liquidity was used before the mainnet launch to de-compile the Foundation’s vesting smart contracts in order to review them. This smart contract language represents more than two years of work, and is fully funded by OCamlPro. It has been developed with formal verification in mind, formal verification being one of the selling points of Tezos. We have elaborated a detailed roadmap mixing model-checking and deductive program verification to investigate this feature. We are now searching for funding opportunities to keep developing and maintaining Liquidity.

    See our online editor to get started! Contact us if you need support, training, writing or in-depth analysis of your smart contracts.

    Techelson: a testing framework for Michelson and Liquidity

    Techelson is the newest addition to our set of tools for the Tezos blockchain. It is a test execution engine for the functional properties of Michelson and Liquidity contracts. Techelson is still in its early development stage. The user documentation is available here. An example of how to use it with Liquidity is detailed in this post.

    Contact us to customize the engine to suit your own needs!

    IronTez: an optimized Tezos node by OCamlPro

    IronTez is a tailored node for private (and public) deployments of Tezos. Among its additional features, the node adds some useful RPCs, improves storage, enables garbage collection and context pruning, allows easy configuration of the private network, and provides additional Michelson instructions (GET_STORAGE, CATCH…). One of its nice features is the ability to enable adaptive baking in a private / proof-of-authority setting (e.g. baking every 5 seconds in the presence of transactions and every 10 minutes otherwise, etc.).

    A simplified version of IronTez has already been made public to allow testing its improved storage system, Ironmin, showing a 10x reduction in storage. Some TzScan.io nodes are also using versions of IronTez. We’ve also successfully deployed it along with TzScan for a big foreign company to experiment with private blockchains. We are searching for projects and funding opportunities to keep developing and maintaining this optimized version of the Tezos node.

    Don’t hesitate to contact us if you want to deploy a blockchain with IronTez, or for more information!

    Looking for more services from OCamlPro?

    • Custom adaptation of our tools for your specific needs: TzScan, Liquidity, IronTez, Techelson
    • Research projects, on verification of Tezos smart contracts
    • Development of smart contracts fitted to your needs
    • In-depth and comprehensive analysis of your smart contracts
    • Training Liquidity or Tezos sessions for your team
    • Prototyping and development of blockchains related solutions
    • Development of new protocol amendments for Tezos
    • Deployment of private Tezos networks
    • Development of full end-to-end blockchain applications
    • Tailored services for Tezos : node hosting, premium TzScan APIs, etc.

    Services around other blockchains

    Though we have built extensive experience with Tezos, our work also benefits from studying and experimenting with other blockchains, such as Bitcoin or Ethereum. For example, we prototyped and developed the Tezos fundraising platform from the ground up, encompassing the web app (back-end and front-end), the Ethereum and Bitcoin (p2sh) multi-signature contracts, as well as the hardware Ledger based process for transferring funds. We also helped prototype and conduct research for some of our partners on the Bitcoin and Ethereum blockchains, and we are following the recent developments on emerging blockchains like Cardano and Cosmos. Currently, we are deploying a notarization service on top of multiple blockchains.

    We can help develop and deploy applications interacting with Bitcoin and Ethereum Blockchains.

    The Blockchains@OCamlPro page: http://blockchains.ocamlpro.com

    Lambda the Ultimate - Programming Languages Weblog: Seven Sketches in Compositionality: An Invitation to Applied Category Theory

    Seven Sketches in Compositionality: An Invitation to Applied Category Theory

    2018 by Brendan Fong and David I. Spivak

    Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.
    A very approachable but useful introduction to category theory. It avoids the Scylla and Charybdis of becoming incomprehensible after page 2 (as many academic texts do), and barely scratching the surface (as many popular texts do).

    Daniel Lemire's blog: Science and Technology links (April 27th 2019)

    1. Women who use oral contraceptives have a harder time recognizing emotions of others.
    2. Worldwide, livestock has ten times the mass of wild animals and nearly twice the mass of human beings. Fishes have more than ten times the mass of human beings.
    3. Using software, we can map brain activity to speech so that listeners can easily identify words (source: Nature).
    4. Aging is associated with a reduction in tissue NAD levels. In turn, this is believed to be associated with physical decline. It is not entirely clear what makes the NAD level decline, but it seems that it might be caused by the accumulation of senescent cells. As we age, we tend to accumulate these non-functional and slightly harmful “dead” cells called “senescent cells”. We now have drugs called senolytics that can remove some of these senescent cells, with at least one ongoing clinical trial.

    Tea Masters: Mountains of glass and steel

    NYC is a city where men have created mountains of glass and steel. Even coming from a big city like Taipei with its 101 bamboo shaped building, it's hard not to be fascinated by New York's skyline. 
    This is especially the case at night or in the early hours of a sunny morning! I wonder if these buildings have inspired the name of the Floating Mountain Tea House. In any case, I find this an excellent tea house name in this city! And that's one reason why I was eager to meet with Scott Norton there.
    (I'm sorry if I didn't contact all my blog readers in the NYC area. These trips always receive a very late approval and it's difficult to plan many or large events. That's why it made sense to meet someone who is very dedicated to tea education and share some of my techniques so that he can pass it onward to a large number of people.)
    So, I brought my little gold coated silver teapot and I let Scott play with it! He's a fast learner and poured with calm and dexterity.
    We tasted three very different teas. A green tea from the tea house, a lapsang souchong and my early 1990s green mark. All three felt particularly pure and light brewed in the silver teapot.
    It's as if the teas were under a microscope: all their scents were intensified. We also experimented with different cups to see how they affect taste and color. The ivory hue went really well with the red tea and the aged sheng puerh.
    I still feel that the best place to brew tea is at home where I have all my teas and accessories (or in nature), but such a tea house is a really nice place to have tea with a friend when the home is not an option. And it's also the opportunity to meet other tea drinkers at events hosted by the tea house and where Scott is the instructor. If you're new to the tea scene, I think it's a great way to learn.
    And it’s also a place where you can find inspiration for some simple and beautiful flower arrangements! Because the taste of tea is always a taste of nature...
    I wish Scott and the Floating Mountain tea house success in spreading traditional tea culture in NYC. It's not an easy task in a city where the water quality is more suited to brew coffee than fine teas!
    But when the air is clear and the sky is blue, it feels like we’re walking among mountains of glass and steel! That’s when I feel almost electrified by the energy of the city! It’s a similar feeling to a sunrise in Alishan or Lishan.
    Bryant Park
    Even in the big city, you'll find the green beauty of nature...
    Central Park
    ... and a longing for a good cup of tea.

    Daniel Lemire's blog: Speeding up a random-access function?

    A common problem in software performance is that you are essentially limited by memory access. Let us consider such a function where you write at random locations in a big array.

     for ( i = 0; i < N; i++) {
        // hash is a randomized hash function
        bigarray[hash(i)] = i; 
     }
    

    This is a good model for how one might construct a large hash table, for example.

    It can be terribly slow if the big array is really large because each and every access is likely to be an expensive cache miss.

    Can you speed up this function?

    It is difficult, but you might be able to accelerate it a bit, or maybe more than a bit. However, it will involve doing extra work.

    Here is a strategy which works, if you do it just right. Divide your big array into regions. For each region, create a stack. Instead of writing directly to the big array, when you are given a hash value, locate the corresponding stack, and append the hash value to it. Then, later, go through the stacks and apply them to the big array.

     
     for ( i = 0; i < N; i++) {
        loc = hash(i)
        add loc, i to buffer[loc / bucketsize]
     }
     for each buffer {
       for each loc,i in buffer
         bigarray[loc] = i
     }
    

    It should be clear that this second strategy is likely to save some expensive cache misses. Indeed, during the first phase, we append to only a few stacks: the top of each stack is likely to be in cache because we have few stacks. Then when you unwind the stacks, you are writing in random order, but within a small region of the big array.

    This is a standard “external memory” algorithm: people used to design a lot of these algorithms when everything was on disk and when disks were really slow.
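
    To make the idea concrete, here is a minimal, self-contained C sketch of the two-phase approach, using a bounded per-region buffer that is flushed when full (a small variation that keeps memory use fixed). The names (CAP, buffered_scatter, bucketsize) and the fixed capacity are illustrative assumptions; this is not the post’s actual benchmark code, which is linked below.

     #include <stdint.h>
     #include <stdlib.h>

     /* A sketch of the buffered scatter; hash is assumed to map i to an
        index in [0, N). Error handling is omitted for brevity. */
     typedef struct { uint32_t loc; uint32_t val; } pair_t;

     enum { CAP = 256 };                          /* pending writes per region */

     static void drain(uint32_t *bigarray, const pair_t *buf, size_t n) {
         for (size_t j = 0; j < n; j++)
             bigarray[buf[j].loc] = buf[j].val;   /* all writes hit one region */
     }

     void buffered_scatter(uint32_t *bigarray, size_t N, size_t bucketsize,
                           uint32_t (*hash)(size_t)) {
         size_t nbuckets = (N + bucketsize - 1) / bucketsize;
         pair_t *buf = malloc(nbuckets * CAP * sizeof(pair_t));
         size_t *cnt = calloc(nbuckets, sizeof(size_t));
         for (size_t i = 0; i < N; i++) {         /* phase 1: append to stacks */
             uint32_t loc = hash(i);
             size_t b = loc / bucketsize;         /* region this write targets */
             buf[b * CAP + cnt[b]] = (pair_t){loc, (uint32_t)i};
             if (++cnt[b] == CAP) {               /* stack full: drain it now  */
                 drain(bigarray, &buf[b * CAP], CAP);
                 cnt[b] = 0;
             }
         }
         for (size_t b = 0; b < nbuckets; b++)    /* phase 2: drain leftovers  */
             drain(bigarray, &buf[b * CAP], cnt[b]);
         free(buf);
         free(cnt);
     }

    The design choice to flush a region’s buffer as soon as it fills keeps the extra memory bounded while preserving the key property: each flush touches only one small region of the big array.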

    So how well do I do? Here are my results on a Skylake processor using GNU GCC 8 (with -O3 -march=native, THP set to madvise).

                  cycles/access   instructions/access
    standard      57              13
    buffered      45              36

    So while the buffered version I coded uses three times as many instructions, and while it needs to allocate a large buffer, it still comes up on top.

    My code is available.

    Charles Petzold: Reading Ian McEwan’s “Machines Like Me”

    It is 1982, and England has just suffered a devastating defeat in the Falklands War. Argentina had managed to obtain some Exocet Series 8 missiles equipped with A.I. targeting software that inflicted enormous damage on Britain’s warships and left almost 3,000 dead. This software, however, had an innocent origin: It was based on open-source facial-recognition algorithms that Alan Turing had designed in the 1960s to help people with the face-blindness condition known as prosopagnosia.

    ... more ...

    OCaml Planet: MirageOS security advisory 02: mirage-xen < 3.3.0

    MirageOS Security Advisory 02 - grant unshare vulnerability in mirage-xen

    • Module: mirage-xen
    • Announced: 2019-04-25
    • Credits: Thomas Leonard, Mindy Preston
    • Affects: mirage-xen < 3.3.0, mirage-block-xen < 1.6.1, mirage-net-xen < 1.10.2, mirage-console < 2.4.2, ocaml-vchan < 4.0.2, ocaml-gnt (no longer supported)
    • Corrected: 2019-04-22: mirage-xen 3.4.0, 2019-04-05: mirage-block-xen 1.6.1, 2019-04-02: mirage-net-xen 1.10.2, 2019-03-27: mirage-console 2.4.2, 2019-03-27: ocaml-vchan 4.0.2

    For general information regarding MirageOS Security Advisories, please visit https://mirage.io/security.

    Background

    MirageOS is a library operating system using cooperative multitasking, which can be executed as a guest of the Xen hypervisor. Virtual machines running on a Xen host can communicate by sharing pages of memory. For example, when a Mirage VM wants to use a virtual network device provided by a Linux dom0:

    1. The Mirage VM reserves some of its memory for this purpose and writes an entry to its grant table to say that dom0 should have access to it.
    2. The Mirage VM tells dom0 (via XenStore) about the grant.
    3. dom0 asks Xen to map the memory into its address space.

    The Mirage VM and dom0 can now communicate using this shared memory. When dom0 has finished with the memory:

    1. dom0 tells Xen to unmap the memory from its address space.
    2. dom0 tells the Mirage VM that it no longer needs the memory.
    3. The Mirage VM removes the entry from its grant table.
    4. The Mirage VM may reuse the memory for other purposes.

    Problem Description

    Mirage removes the entry by calling the gnttab_end_access function in Mini-OS. This function checks whether the remote domain still has the memory mapped. If so, it returns 0 to indicate that the entry cannot be removed yet. To make this function available to OCaml code, the stub_gntshr_end_access C stub in mirage-xen wrapped this with the OCaml calling conventions. Unfortunately, it ignored the return code and reported success in all cases.
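
    To illustrate the class of bug (this is a sketch, not the actual mirage-xen or Mini-OS source, and the signatures are simplified assumptions), the difference between the vulnerable and the corrected stub comes down to whether the result of the end-access call is propagated:

     #include <stdbool.h>

     /* Hypothetical Mini-OS-like call: returns 0 while the remote domain
        still has the granted page mapped (entry cannot be removed yet). */
     extern int grant_end_access(int grant_ref);

     /* Buggy pattern: the result is discarded, so the caller always
        believes the unshare succeeded and may reuse the still-mapped page. */
     void stub_end_access_buggy(int grant_ref) {
         (void)grant_end_access(grant_ref);
     }

     /* Fixed pattern: report failure so higher-level code can keep the
        page alive (or retry) while the remote domain still maps it. */
     bool stub_end_access_fixed(int grant_ref) {
         return grant_end_access(grant_ref) != 0;
     }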

    Impact

    A malicious VM can tell a MirageOS unikernel that it has finished using some shared memory while it is still mapped. The Mirage unikernel will think that the unshare operation has succeeded and may reuse the memory, or allow it to be garbage collected. The malicious VM will still have access to the memory.

    In many cases (such as in the example above) the remote domain will be dom0, which is already fully trusted. However, if a unikernel shares memory with an untrusted domain then there is a problem.

    Workaround

    No workaround is available.

    Solution

    Returning the result from the C stub required changes to the OCaml grant API to deal with the result. This turned out to be difficult because, for historical reasons, the OCaml part of the API was in the ocaml-gnt package while the C stubs were in mirage-xen, and because the C stubs are also shared with the Unix backend.

    We instead created a new grant API in mirage-xen, migrated all existing Mirage drivers to use it, and then dropped support for the old API. mirage-xen 3.3.0 added support for the new API and 3.4.0 removed support for the old one.

    The recommended way to upgrade is:

      opam update
      opam upgrade mirage-xen
    

    Correction details

    The following PRs were part of the fix:

    References

    You can find the latest version of this advisory online at https://mirage.io/blog/MSA02.

    This advisory is signed using OpenPGP, you can verify the signature by downloading our public key from a keyserver (gpg --recv-key 4A732D757C0EDA74), downloading the raw markdown source of this advisory from GitHub and executing gpg --verify 02.txt.asc.

    The Shape of Code: Dimensional analysis of the Halstead metrics

    One of the driving forces behind the Halstead complexity metrics was physics envy; the early reports by Halstead use the terms software physics and software science.

    One very simple, and effective technique used by scientists and engineers to check whether an equation makes sense, is dimensional analysis. The basic idea is that when performing an operation between two variables, their measurement units must be consistent; for instance, two lengths can be added, but a length and a time cannot be added (a length can be divided by time, returning distance traveled per unit time, i.e., velocity).

    Let’s run a dimensional analysis check on the Halstead equations.

    The input variables to the Halstead metrics are: \(\eta_1\), the number of distinct operators, \(\eta_2\), the number of distinct operands, \(N_1\), the total number of operators, and \(N_2\), the total number of operands. These quantities can all be interpreted as measurements in units of tokens.

    The formulae are:

    • Program length: \(N = N_1 + N_2\)
      There is a consistent interpretation of this equation: operators and operands are both kinds of tokens, and a number of tokens can be interpreted as a length.
    • Calculated program length: \(\hat{N} = \eta_1 \log_2 \eta_1 + \eta_2 \log_2 \eta_2\)
      There is a consistent interpretation of this equation: the operand of a logarithm has to be dimensionless, and the convention is to treat it as a ratio (if no denominator is specified, a value of 1 having the same dimensions as the numerator is assumed, giving a dimensionless result). The value returned is dimensionless and can be multiplied by a variable having any kind of dimension; so, again, two (token) lengths are being added.
    • Volume: \(V = N \log_2 \eta\), where \(\eta = \eta_1 + \eta_2\) is the vocabulary
      A volume has units of length^3 (i.e., it is created by multiplying three lengths). There is only one length in this equation; the equation is misnamed, it is a length.
    • Difficulty: \(D = \frac{\eta_1}{2} \cdot \frac{N_2}{\eta_2}\)
      Here the dimensions of \(\eta_1\) and \(\eta_2\) cancel, leaving the dimensions of \(N_2\) (a length); now Halstead is interpreting length as a difficulty unit (whatever that might be).
    • Effort: \(E = D \cdot V\)
      This equation multiplies two variables, both having a length dimension; the result should be interpreted as an area. In physics, work is force times distance, and power is work per unit time; the term effort is not defined.
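
    As a sketch of how these definitions combine (my own illustration, not part of the post), here are a few lines of Python that compute the metrics from the four token counts; it makes plain that every formula is built only out of token counts and logarithms of token counts:

      import math

      def halstead(eta1, eta2, N1, N2):
          """Halstead metrics from the four token counts:
          eta1/eta2 = distinct operators/operands, N1/N2 = total operators/operands."""
          N = N1 + N2                                               # program length (tokens)
          N_hat = eta1 * math.log2(eta1) + eta2 * math.log2(eta2)   # calculated length
          eta = eta1 + eta2                                         # vocabulary
          V = N * math.log2(eta)                                    # "volume" (dimensionally, still a length)
          D = (eta1 / 2) * (N2 / eta2)                              # "difficulty" (a length)
          E = D * V                                                 # "effort" (dimensionally, an area)
          return {"N": N, "N_hat": N_hat, "V": V, "D": D, "E": E}

      # Example: 10 distinct operators, 15 distinct operands, 40 operator tokens, 35 operand tokens.
      print(halstead(10, 15, 40, 35))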

    Halstead is claiming that a single dimension, program length, contains so much unique information that it can be used as a measure of a variety of disparate quantities.

    Halstead’s colleagues at Purdue were rather damning in their analysis of these metrics. Their report Software Science Revisited: A Critical Analysis of the Theory and Its Empirical Support points out the lack of any theoretical foundation for some of the equations, that the analysis of the data was weak, and that a more thorough analysis suggests theory and data don’t agree.

    I pointed out in an earlier post that people use Halstead’s metrics because everybody else does. This post is unlikely to change existing herd behavior, but it gives me another page to point people at when they ask why I laugh at their use of these metrics.

    CreativeApplications.Net: Neuhaus.world – Participatory, interactive music video for Lake of Pavement

    Created by Moniker, Neuhaus.world is a participatory, interactive music video for Lake of Pavement, a new track by the emerging Rotterdam based artist Jo Goes Hunting.

    things magazine: Away days

    Historical Times presents a collection of culturally important ruins / Mars in the Gobi Desert / how climate change deniers make themselves seem legitimate / Island living. See also The Gatekeepers, a photography project by Alex Ingram / Asia, through … Continue reading

    The Shape of Code: C2X and undefined behavior

    The ISO C Standard is currently being revised by WG14, to create C2X.

    There is a rather nebulous clustering of people who want to stop compilers using undefined behaviors to generate what these people (and probably most other developers) consider to be very surprising code. For instance, always printing both "p is true" and "p is false" when executing the code: bool p; if ( p ) printf("p is true"); if ( !p ) printf("p is false"); (possible because p is uninitialized, and accessing an uninitialized value is undefined behavior).

    This sounds like a good thing; nobody wants compilers generating surprising code.

    All the proposals I have seen, so far, involve doing away with constructs that can produce undefined behavior. Again, this sounds like a good thing; nobody likes undefined behaviors.

    The problem is, there is a reason for labeling certain constructs as producing undefined behavior; the behavior is who-knows-what.

    Now the C Standard could specify the who-knows-what behavior; for instance, it could specify that the result of dividing by zero is 42. Standard-conforming compilers would then have to generate code to check whether the denominator was zero, and return 42 for this case (until Intel, ARM and other processor vendors ‘updated’ the behavior of their divide instructions). Way back when, a design decision was made: the behavior of divide by zero is undefined, not 42 or any other value; code efficiency and compactness were considered to be more important.

    I have not seen anybody arguing that the behavior of divide by zero should be specified. But I have seen people arguing that once C’s integer representation is specified as being two’s complement (currently it can also be ones’ complement or signed magnitude), then arithmetic overflow becomes defined. Wrong.

    Two’s complement is a specification of a representation, not a specification of behavior. What is the behavior when the result of adding two integers cannot be represented? The result might wrap (the behavior expected by many developers), saturate at the maximum value (frequently needed in image and signal processing), raise a signal (overflow is not usually supposed to happen), or something else.
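
    As an illustration only (my own sketch, not from the post), here is how three of those candidate behaviors differ for 32-bit signed addition:

      INT_MIN, INT_MAX = -2**31, 2**31 - 1

      def add_wrap(a, b):
          """Wrap around modulo 2**32, re-interpreted as a signed value."""
          return (a + b + 2**31) % 2**32 - 2**31

      def add_saturate(a, b):
          """Clamp the mathematically exact result to the representable range."""
          return max(INT_MIN, min(INT_MAX, a + b))

      def add_trap(a, b):
          """Raise an error when the exact result is not representable."""
          r = a + b
          if not INT_MIN <= r <= INT_MAX:
              raise OverflowError("signed overflow")
          return r

      print(add_wrap(INT_MAX, 1))      # -2147483648
      print(add_saturate(INT_MAX, 1))  #  2147483647
      # add_trap(INT_MAX, 1) raises OverflowError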

    WG14 could define the behavior for when the result of an arithmetic operation is not representable in the number of bits available. Standard-conforming compilers targeting processors whose arithmetic instructions did not behave as required would have to generate code, for any operation that could overflow, to do what was necessary. The embedded market is a heavy user of C; in this market memory is limited and processor performance is never fast enough, so the overhead of supporting a defined behavior could just be too high (a more attractive solution is code review, to make sure the undefined behavior cannot occur).

    Is there another way of addressing the issue of compiler writers’ use/misuse of undefined behavior? Yes, offer them money. Compiler writing is a business, at least at the level at which gcc and llvm operate. If people really are keen to influence the code generated by gcc and llvm, money is the solution. Wot, no money? Then stop complaining.

    Planet Lisp: Paul Khuong: Fractional Set Covering With Experts

    Last winter break, I played with one of the annual capacitated vehicle routing problem (CVRP) “Santa Claus” contests. Real world family stuff took precedence, so, after the obvious LKH with Concorde polishing for individual tours, I only had enough time for one diversification moonshot. I decided to treat the high level problem of assembling prefabricated routes as a set covering problem: I would solve the linear programming (LP) relaxation for the min-cost set cover, and use randomised rounding to feed new starting points to LKH. Add a lot of luck, and that might just strike the right balance between solution quality and diversity.

    Unsurprisingly, luck failed to show up, but I had ulterior motives: I’m much more interested in exploring first order methods for relaxations of combinatorial problems than in solving CVRPs. The routes I had accumulated after a couple days turned into a set covering LP with 1.1M decision variables, 10K constraints, and 20M nonzeros. That’s maybe denser than most combinatorial LPs (the aspect ratio is definitely atypical), but 0.2% non-zeros is in the right ballpark.

    As soon as I had that fractional set cover instance, I tried to solve it with a simplex solver. Like any good Googler, I used Glop... and stared at a blank terminal for more than one hour.

    Having observed that lack of progress, I implemented the toy I really wanted to try out: first order online “learning with experts” (specifically, AdaHedge) applied to LP optimisation. I let this not-particularly-optimised serial CL code run on my 1.6 GHz laptop for 21 hours, at which point the first order method had found a 4.5% infeasible solution (i.e., all the constraints were satisfied with \(\ldots \geq 0.955\) instead of \(\ldots \geq 1\)). I left Glop running long after the contest was over, and finally stopped it with no solution after more than 40 days on my 2.9 GHz E5.

    Given the shape of the constraint matrix, I would have loved to try an interior point method, but all my licenses had expired, and I didn’t want to risk OOMing my workstation. Erling Andersen was later kind enough to test Mosek’s interior point solver on it. The runtime was much more reasonable: 10 minutes on 1 core, and 4 on 12 cores, with the sublinear speed-up mostly caused by the serial crossover to a simplex basis.

    At 21 hours for a naïve implementation, the “learning with experts” first order method isn’t practical yet, but also not obviously uninteresting, so I’ll write it up here.

    Using online learning algorithms for the “experts problem” (e.g., Freund and Schapire’s Hedge algorithm) to solve linear programming feasibility is now a classic result; Jeremy Kun has a good explanation on his blog. What’s new here is:

    1. Directly solving the optimisation problem.
    2. Confirming that the parameter-free nature of AdaHedge helps.

    The first item is particularly important to me because it’s a simple modification to the LP feasibility meta-algorithm, and might make the difference between a tool that’s only suitable for theoretical analysis and a practical approach.

    I’ll start by reviewing the experts problem, and how LP feasibility is usually reduced to the former problem. After that, I’ll cast the reduction as a surrogate relaxation method, rather than a Lagrangian relaxation; optimisation should flow naturally from that point of view. Finally, I’ll guess why I had more success with AdaHedge this time than with Multiplicative Weight Update eight years ago.1

    The experts problem and LP feasibility

    I first heard about the experts problem while researching dynamic sorted set data structures: Igal Galperin’s PhD dissertation describes scapegoat trees, but is really about online learning with experts. Arora, Hazan, and Kale’s 2012 survey of multiplicative weight update methods is probably a better introduction to the topic ;)

    The experts problem comes in many variations. The simplest form sounds like the following. Assume you’re playing a binary prediction game over a predetermined number of turns, and have access to a fixed finite set of experts at each turn. At the beginning of every turn, each expert offers their binary prediction (e.g., yes it will rain today, or it will not rain today). You then have to make a prediction yourself, with no additional input. The actual result (e.g., it didn’t rain today) is revealed at the end of the turn. In general, you can’t expect to be right more often than the best expert at the end of the game. Is there a strategy that bounds the “regret,” i.e., how many more wrong predictions you’ll make compared to the expert(s) with the highest number of correct predictions, and in what circumstances?

    Amazingly enough, even with an omniscient adversary that has access to your strategy and determines both the experts’ predictions and the actual result at the end of each turn, a stream of random bits (hidden from the adversary) suffices to bound our expected regret in \(\mathcal{O}(\sqrt{T}\,\lg n)\), where \(T\) is the number of turns and \(n\) the number of experts.

    I long had trouble with that claim: it just seems too good of a magic trick to be true. The key realisation for me was that we’re only comparing against individual experts. If each expert is a move in a matrix game, that’s the same as claiming you’ll never do much worse than any pure strategy. One example of a pure strategy is always playing rock in Rock-Paper-Scissors; pure strategies are really bad! The trick is actually in making that regret bound useful.

    We need a more continuous version of the experts problem for LP feasibility. We’re still playing a turn-based game, but, this time, instead of outputting a prediction, we get to “play” a mixture of the experts (with non-negative weights that sum to 1). At the beginning of each turn, we describe what weight we’d like to give to each expert (e.g., 60% rock, 40% paper, 0% scissors). The cost (equivalently, payoff) for each expert is then revealed (e.g., \(\mathrm{rock} = -0.5\), \(\mathrm{paper} = 0.5\), \(\mathrm{scissors} = 0\)), and we incur the weighted average from our play (e.g., \(60\% \cdot -0.5 + 40\% \cdot 0.5 = -0.1\)) before playing the next round.2 The goal is to minimise our worst-case regret, the additive difference between the total cost incurred by our mixtures of experts and that of the a posteriori best single expert. In this case as well, online learning algorithms guarantee regret in \(\mathcal{O}(\sqrt{T} \, \lg n)\).
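
    A minimal sketch of one round of this continuous game (my own illustration, not from the post), using the classic Hedge rule — weights proportional to \(\exp(-\eta \cdot \mathrm{cumulative\ cost})\) — as the strategy; the learning rate eta and the random costs are assumptions of the sketch, not something the post specifies:

      import numpy as np

      def hedge_weights(cum_cost, eta):
          """Hedge: weight each expert proportionally to exp(-eta * cumulative cost)."""
          z = np.exp(-eta * (cum_cost - cum_cost.min()))   # shift for numerical stability
          return z / z.sum()

      rng = np.random.default_rng(0)
      n, eta = 3, 0.5
      cum_cost, total = np.zeros(n), 0.0
      for turn in range(100):
          w = hedge_weights(cum_cost, eta)        # our mixture over the experts
          cost = rng.uniform(-1, 1, size=n)       # adversary reveals per-expert costs
          total += w @ cost                       # we incur the weighted average cost
          cum_cost += cost
      print("regret:", total - cum_cost.min())    # vs. the best single expert in hindsight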

    This line of research is interesting because simple algorithms achieve that bound, with explicit constant factors on the order of 1,3 and those bounds are known to be non-asymptotically tight for a large class of algorithms. Like dense linear algebra or fast Fourier transforms, where algorithms are often compared by counting individual floating point operations, online learning has matured into such tight bounds that worst-case regret is routinely presented without Landau notation. Advances improve constant factors in the worst case, or adapt to easier inputs in order to achieve “better than worst case” performance.

    The reduction below lets us take any learning algorithm with an additive regret bound, and convert it to an algorithm with a corresponding worst-case iteration complexity bound for \(\varepsilon\)-approximate LP feasibility. An algorithm that promises low worst-case regret in \(\mathcal{O}(\sqrt{T})\) gives us an algorithm that needs at most \(\mathcal{O}(1/\varepsilon\sp{2})\) iterations to return a solution that almost satisfies every constraint in the linear program, where each constraint is violated by \(\varepsilon\) or less (e.g., \(x \leq 1\) is actually \(x \leq 1 + \varepsilon\)).

    We first split the linear program in two components, a simple domain (e.g., the non-negative orthant or the \([0, 1]\sp{d}\) box) and the actual linear constraints. We then map each of the latter constraints to an expert, and use an arbitrary algorithm that solves our continuous version of the experts problem as a black box. At each turn, the black box will output a set of non-negative weights for the constraints (experts). We will average the constraints using these weights, and attempt to find a solution in the intersection of our simple domain and the weighted average of the linear constraints.
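
    A skeleton of that reduction might look like the following sketch (mine; the helpers black_box_weights, black_box_update, and solve_simple_subproblem are hypothetical stand-ins for the online learner and the domain-specific solver). It only shows the data flow, not a complete solver:

      import numpy as np

      def lp_feasibility_via_experts(A, b, solve_simple_subproblem,
                                     black_box_weights, black_box_update, T):
          """Approximately find x in the simple domain with A @ x <= b.
          Each row of A (one constraint) is an expert; payoffs are violations."""
          acc = None
          for t in range(T):
              w = black_box_weights()                 # non-negative weights summing to 1
              avg_a, avg_b = w @ A, w @ b             # the weighted average constraint
              x = solve_simple_subproblem(avg_a, avg_b)
              if x is None:                           # averaged constraint infeasible
                  return None                         # => the original LP is infeasible
              acc = x if acc is None else acc + x
              violation = A @ x - b                   # per-constraint payoff this turn
              black_box_update(violation)
          return acc / T                              # average solution; epsilon-feasible for large enough T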

    Let’s use Stigler’s Diet Problem with three foods and two constraints as a small example, and further simplify it by disregarding the minimum value for calories, and the maximum value for vitamin A. Our simple domain here is at least the non-negative orthant: we can’t ingest negative food. We’ll make things more interesting by also making sure we don’t eat more than 10 servings of any food per day.

    The first constraint says we mustn’t get too many calories

    \[72 x\sb{\mathrm{corn}} + 121 x\sb{\mathrm{milk}} + 65 x\sb{\mathrm{bread}} \leq 2250,\]

    and the second constraint (tweaked to improve this example) ensures we get enough vitamin A

    \[107 x\sb{\mathrm{corn}} + 400 x\sb{\mathrm{milk}} \geq 5000,\]

    or, equivalently,

    \[-107 x\sb{\mathrm{corn}} - 400 x\sb{\mathrm{milk}} \leq -5000.\]

    Given weights \([¾, ¼]\), the weighted average of the two constraints is

    \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\]

    where the coefficients for each variable and for the right-hand side were averaged independently.

    The subproblem asks us to find a feasible point in the intersection of these two constraints: \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

    Classically, we claim that this is just Lagrangian relaxation, and find a solution to

    \[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

    In the next section, I’ll explain why I think this analogy is wrong and worse than useless. For now, we can easily optimise one variable at a time, and find the solution \(x\sb{\mathrm{corn}} = 0\), \(x\sb{\mathrm{milk}} = 10\), \(x\sb{\mathrm{bread}} = 0\), with objective value \(-92.5\) (which is \(530\) less than \(437.5\)).

    In general, three things can happen at this point. We could discover that the subproblem is infeasible. In that case, the original non-relaxed linear program itself is infeasible: any solution to the original LP satisfies all of its constraints, and thus would also satisfy any weighted average of the same constraints. We could also be extremely lucky and find that our optimal solution to the relaxation is (\(\varepsilon\)-)feasible for the original linear program; we can stop with a solution. More commonly, we have a solution that’s feasible for the relaxation, but not for the original linear program.

    Since that solution satisfies the weighted average constraint, the black box’s payoff for this turn (and for every other turn) is non-positive. In the current case, the first constraint (on calories) is satisfied by \(1040\), while the second (on vitamin A) is violated by \(1000\). On weighted average, the constraints are satisfied by \(\frac{1}{4}(3 \cdot 1040 - 1000) = 530.\) Equivalently, they’re violated by \(-530\) on average.
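
    These numbers are easy to check; a few lines of Python (my own, just reproducing the arithmetic above):

      import numpy as np

      A = np.array([[  72.0,  121.0, 65.0],   # calories <= 2250
                    [-107.0, -400.0,  0.0]])  # negated vitamin A constraint, <= -5000
      b = np.array([2250.0, -5000.0])
      w = np.array([0.75, 0.25])               # the weights [3/4, 1/4]

      avg_a, avg_b = w @ A, w @ b              # -> [27.25, -9.25, 48.75] and 437.5
      x = np.where(avg_a < 0, 10.0, 0.0)       # minimise over the [0, 10] box, one variable at a time
      print(avg_a, avg_b, x)                   # x = [0, 10, 0]
      print(b - A @ x)                         # per-constraint satisfaction: [1040, -1000]
      print(w @ (b - A @ x))                   # weighted average satisfaction: 530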

    We’ll add that solution to an accumulator vector that will come in handy later.

    The next step is the key to the reduction: we’ll derive payoffs (negative costs) for the black box from the solution to the last relaxation. Each constraint (expert) has a payoff equal to its level of violation in the relaxation’s solution. If a constraint is strictly satisfied, the payoff is negative; for example, the constraint on calories is satisfied by \(1040\), so its payoff this turn is \(-1040\). The constraint on vitamin A is violated by \(1000\), so its payoff this turn is \(1000\). Next turn, we expect the black box to decrease the weight of the constraint on calories, and to increase the weight of the one on vitamin A.

    After \(T\) turns, the total payoff for each constraint is equal to the sum of violations by all solutions in the accumulator. Once we divide both sides by \(T\), we find that the divided payoff for each constraint is equal to its violation by the average of the solutions in the accumulator. For example, if we have two solutions, one that violates the calories constraint by \(500\) and another that satisfies it by \(1000\) (violates it by \(-1000\)), the total payoff for the calories constraint is \(-500\), and the average of the two solutions does strictly satisfy the linear constraint by \(\frac{500}{2} = 250\)!

    We also know that we only generated feasible solutions to the relaxed subproblem (otherwise, we’d have stopped and marked the original LP as infeasible), so the black box’s total payoff is \(0\) or negative.

    Finally, we assumed that the black box algorithm guarantees an additive regret in \(\mathcal{O}(\sqrt{T}\, \lg n)\), so the black box’s payoff of (at most) \(0\) means that any constraint’s payoff is at most \(\mathcal{O}(\sqrt{T}\, \lg n)\). After dividing by \(T\), we obtain a bound on the violation by the arithmetic mean of all solutions in the accumulator: for every constraint, that violation is in \(\mathcal{O}\left(\frac{\lg n}{\sqrt{T}}\right)\). In other words, the number of iterations \(T\) must scale with \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\), which isn’t bad when \(n\) is in the millions but \(\varepsilon \approx 0.01\).

    Theoreticians find this reduction interesting because there are concrete implementations of the black box, e.g., the multiplicative weight update (MWU) method with non-asymptotic bounds. For many problems, this makes it possible to derive the exact number of iterations necessary to find an \(\varepsilon-\)feasible fractional solution, given \(\varepsilon\) and the instance’s size (but not the instance itself).

    That’s why algorithms like MWU are theoretically useful tools for fractional approximations, even when we already have subgradient methods that only need \(\mathcal{O}\left(\frac{1}{\varepsilon}\right)\) iterations: state-of-the-art algorithms for learning with experts have explicit non-asymptotic regret bounds that yield, for many problems, iteration bounds that only depend on the instance’s size, but not its data. While the iteration count when solving LP feasibility with MWU scales with \(\frac{1}{\varepsilon\sp{2}}\), it is merely proportional to \(\lg n\), the log of the number of linear constraints. That’s attractive, compared to subgradient methods for which the iteration count scales with \(\frac{1}{\varepsilon}\), but also scales linearly with respect to instance-dependent values like the distance between the initial dual solution and the optimum, or the Lipschitz constant of the Lagrangian dual function; these values are hard to bound, and are often proportional to the square root of the number of constraints. Given the choice between \(\mathcal{O}\left(\frac{\lg n}{\varepsilon\sp{2}}\right)\) iterations with explicit constants, and a looser \(\mathcal{O}\left(\frac{\sqrt{n}}{\varepsilon}\right)\), it’s obvious why MWU and online learning are powerful additions to the theory toolbox.

    Theoreticians are otherwise not concerned with efficiency, so the usual answer to someone asking about optimisation is to tell them they can always reduce linear optimisation to feasibility with a binary search on the objective value. I once made the mistake of implementing that binary search strategy. Unsurprisingly, it wasn’t useful. I also tried another theoretical reduction, where I looked for a pair of primal and dual \(\varepsilon-\)feasible solutions that happened to have the same objective value. That also failed, in a more interesting manner: since the two solutions had to have almost the same value, the universe spited me by sending back solutions that were primal and dual infeasible in the worst possible way. In the end, the second reduction generated fractional solutions that were neither feasible nor superoptimal, which really isn’t helpful.

    Direct linear optimisation with experts

    The reduction above works for any “simple” domain, as long as it’s convex and we can solve the subproblems, i.e., find a point in the intersection of the simple domain and a single linear constraint or determine that the intersection is empty.

    The set of (super)optimal points in some initial simple domain is still convex, so we could restrict our search to the subset of the domain that is superoptimal for the linear program we wish to optimise, and directly reduce optimisation to the feasibility problem solved in the last section, without binary search.

    That sounds silly at first: how can we find solutions that are superoptimal when we don’t even know the optimal value?

    Remember that the subproblems are always relaxations of the original linear program. We can port the objective function from the original LP over to the subproblems, and optimise the relaxations. Any solution that’s optimal for a relaxation must have an optimal or superoptimal value for the original LP.

    Rather than treating the black box online solver as a generator of Lagrangian dual vectors, we’re using its weights as solutions to the surrogate relaxation dual. The latter interpretation isn’t just more powerful because it handles objective functions. It also makes more sense: the weights generated by algorithms for the experts problem are probabilities, i.e., they’re non-negative and sum to \(1\). That’s also what’s expected of surrogate dual vectors, but definitely not the case for Lagrangian dual vectors, even when restricted to \(\leq\) constraints.

    We can do even better!

    Unlike Lagrangian dual solvers, which only converge when fed (approximate) subgradients and thus need (nearly) optimal solutions to the relaxed subproblems, our reduction to the experts problem only needs feasible solutions to the subproblems. That’s all we need to guarantee an \(\varepsilon-\)feasible solution to the initial problem in a bounded number of iterations. We also know exactly how that \(\varepsilon-\)feasible solution is generated: it’s the arithmetic mean of the solutions to the relaxed subproblems.

    This lets us decouple finding lower bounds from generating feasible solutions that will, on average, \(\varepsilon-\)satisfy the original LP. In practice, the search for an \(\varepsilon-\)feasible solution that is also superoptimal will tend to improve the lower bound. However, nothing forces us to evaluate lower bounds synchronously, or to only use the experts problem solver to improve our bounds.

    We can find a new bound from any vector of non-negative constraint weights: they always yield a valid surrogate relaxation. We can solve that relaxation, and update our best bound when it’s improved. The Diet subproblem earlier had

    \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

    Adding the original objective function back yields the linear program

    \[\min 0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\] subject to \[27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}} \leq 437.5,\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10,\]

    which has a trivial optimal solution at \([0, 0, 0]\).

    When we generate a feasible solution for the same subproblem, we can use any valid bound on the objective value to find the most feasible solution that is also assuredly (super)optimal. For example, if some oracle has given us a lower bound of \(2\) for the original Diet problem, we can solve for

    \[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

    We can relax the objective value constraint further, since we know that the final \(\varepsilon-\)feasible solution is a simple arithmetic mean. Given the same best bound of \(2\), and, e.g., a current average of \(3\) solutions with a value of \(1.9\), a new solution with an objective value of \(2.3\) (more than our best bound, so not necessarily optimal!) would yield a new average solution with a value of \(2\), which is still (super)optimal. This means we can solve the more relaxed subproblem

    \[\min 27.25 x\sb{\mathrm{corn}} - 9.25 x\sb{\mathrm{milk}} + 48.75 x\sb{\mathrm{bread}}\] subject to \[0.18 x\sb{\mathrm{corn}} + 0.23 x\sb{\mathrm{milk}} + 0.05 x\sb{\mathrm{bread}}\leq 2.3\] \[0 \leq x\sb{\mathrm{corn}},\, x\sb{\mathrm{milk}},\, x\sb{\mathrm{bread}} \leq 10.\]

    Given a bound on the objective value, we swapped the constraint and the objective; the goal is to maximise feasibility, while generating a solution that’s “good enough” to guarantee that the average solution is still (super)optimal.

    For box-constrained linear programs where the box is the convex domain, subproblems are bounded linear knapsacks, so we can simply stop the greedy algorithm as soon as the objective value constraint is satisfied, or when the knapsack constraint becomes active (we found a better bound).
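
    For the set covering case, such a greedy could look roughly like the sketch below (mine; the budget-based early exit is only my reading of the tweak described above, the function and parameter names are hypothetical, and costs and coverages are assumed strictly positive):

      def greedy_fractional_cover(cost, cover, rhs, upper, budget=None):
          """min cost . x  s.t.  cover . x >= rhs,  0 <= x <= upper.
          Greedily buy the cheapest coverage first; assumes cost, cover > 0.
          If `budget` is given, stop once the accumulated cost reaches it."""
          order = sorted(range(len(cost)), key=lambda j: cost[j] / cover[j])
          x, covered, spent = [0.0] * len(cost), 0.0, 0.0
          for j in order:
              if covered >= rhs:                       # covering constraint active: done
                  break
              take = min(upper[j], (rhs - covered) / cover[j])
              if budget is not None:
                  take = min(take, max(0.0, (budget - spent) / cost[j]))
              x[j] = take
              covered += take * cover[j]
              spent += take * cost[j]
              if budget is not None and spent >= budget:
                  break                                # objective-value budget exhausted
          return x, covered, spent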

    This last tweak doesn’t just accelerate convergence to \(\varepsilon-\)feasible solutions. More importantly for me, it pretty much guarantees that our \(\varepsilon-\)feasible solution matches the best known lower bound, even if that bound was provided by an outside oracle. Bundle methods and the Volume algorithm can also mix solutions to relaxed subproblems in order to generate \(\varepsilon-\)feasible solutions, but the result lacks the last guarantee: their fractional solutions are even more superoptimal than the best bound, and that can make bounding and variable fixing difficult.

    The secret sauce: AdaHedge

    Before last Christmas’s CVRP set covering LP, I had always used the multiplicative weight update (MWU) algorithm as my black box online learning algorithm: it wasn’t great, but I couldn’t find anything better. The two main downsides for me were that I had to know a “width” parameter ahead of time, as well as the number of iterations I wanted to run.

    The width is essentially the range of the payoffs; in our case, the potential level of violation or satisfaction of each constraint by any solution to the relaxed subproblems. The dependence isn’t surprising: folklore in Lagrangian relaxation also says that’s a big factor there. The problem is that the most extreme violations and satisfactions are initialisation parameters for the MWU algorithm, and the iteration count for a given \(\varepsilon\) is quadratic in the width (\(\mathrm{max\_violation} \cdot \mathrm{max\_satisfaction}\)).

    What’s even worse is that the MWU is explicitly tuned for a specific iteration count. If I estimate that, given my worst-case width estimate, one million iterations will be necessary to achieve \(\varepsilon-\)feasibility, MWU tuned for 1M iterations will need 1M iterations, even if the actual width is narrower.

    de Rooij and others published AdaHedge in 2013, an algorithm that addresses both these issues by smoothly estimating its parameter over time, without using the doubling trick.4 AdaHedge’s loss (convergence rate to an \(\varepsilon-\)solution) still depends on the relaxation’s width. However, it depends on the maximum width actually observed during the solution process, and not on any explicit worst-case bound. It’s also not explicitly tuned for a specific iteration count, and simply keeps improving at a rate that roughly matches MWU. If the instance happens to be easy, we will find an \(\varepsilon-\)feasible solution more quickly. In the worst case, the iteration count is never much worse than that of an optimally tuned MWU.
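
    For reference, here is a compact NumPy sketch of the AdaHedge update as I understand it (a paraphrase of de Rooij et al.’s formulation, not the Common Lisp code linked below); the key point is that the learning rate eta is derived from the cumulative mixability gap, so there is nothing to tune by hand:

      import numpy as np

      def adahedge_round(cum_loss, gap, loss):
          """One AdaHedge round.  cum_loss: per-expert cumulative losses so far,
          gap: cumulative mixability gap, loss: this round's per-expert losses.
          Returns (weights played, updated cum_loss, updated gap)."""
          n = len(cum_loss)
          if gap <= 0.0:                          # no gap yet: follow the leader
              w = (cum_loss == cum_loss.min()).astype(float)
              w /= w.sum()
              mix = (cum_loss + loss).min() - cum_loss.min()
          else:
              eta = np.log(n) / gap               # the self-tuned learning rate
              z = np.exp(-eta * (cum_loss - cum_loss.min()))
              w = z / z.sum()
              # mix loss: -(1/eta) * log( sum_k w_k * exp(-eta * loss_k) ), shifted for stability
              mix = loss.min() - np.log(w @ np.exp(-eta * (loss - loss.min()))) / eta
          hedge = w @ loss                        # the loss we actually incur
          gap += max(hedge - mix, 0.0)            # mixability gap is non-negative
          return w, cum_loss + loss, gap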

    These 400 lines of Common Lisp implement AdaHedge and use it to optimise the set covering LP. AdaHedge acts as the online black box solver for the surrogate dual problem, the relaxed set covering LP is a linear knapsack, and each subproblem attempts to improve the lower bound before maximising feasibility.

    When I ran the code, I had no idea how long it would take to find a feasible enough solution: covering constraints can never be violated by more than \(1\), but some points could be covered by hundreds of tours, so the worst case satisfaction width is high. I had to rely on the way AdaHedge adapts to the actual hardness of the problem. In the end, \(34492\) iterations sufficed to find a solution that was \(4.5\%\) infeasible.5 This corresponds to a worst case with a width of less than \(2\), which is probably not what happened. It seems more likely that the surrogate dual isn’t actually an omniscient adversary, and AdaHedge was able to exploit some of that “easiness.”

    The iterations themselves are also reasonable: one sparse matrix / dense vector multiplication to convert surrogate dual weights to an average constraint, one solve of the relaxed LP, and another sparse matrix / dense vector multiplication to compute violations for each constraint. The relaxed LP is a fractional \([0, 1]\) knapsack, so the bottleneck is sorting double floats. Each iteration took 1.8 seconds on my old laptop; I’m guessing that could easily be 10-20 times faster with vectorisation and parallelisation.

    In another post, I’ll show how using the same surrogate dual optimisation algorithm to mimic Lagrangian decomposition instead of Lagrangian relaxation guarantees an iteration count in \(\mathcal{O}\left(\frac{\lg \#\mathrm{nonzero}}{\varepsilon\sp{2}}\right)\), independently of luck or the specific linear constraints.


    1. Yes, I have been banging my head against that wall for a while.

    2. This is equivalent to minimising expected loss with random bits, but cleans up the reduction.

    3. When was the last time you had to worry whether that log was natural or base-2?

    4. The doubling trick essentially says to start with an estimate for some parameters (e.g., width), then adjust it to at least double the expected iteration count when the parameter’s actual value exceeds the estimate. The sum telescopes and we only pay a constant multiplicative overhead for the dynamic update.

    5. I think I computed the \(\log\) of the number of decision variables instead of the number of constraints, so maybe this could have gone a bit better.

    OCaml Weekly News: OCaml Weekly News, 23 Apr 2019

    1. Wrapping C++ std::shared_ptr and similar smart pointers
    2. OCaml 4.08.0+beta3
    3. Menhir and preserving comments from source
    4. ppx_protocol_conv 5.0.0
    5. Orsetto: structured data interchange languages (preview release)
    6. Searching for functions
    7. Other OCaml News

    Charles Petzold: Reading “Stoner” by John Williams

    After reading several articles about “the best novel you’ve never heard of,” I finally surrendered and determined for myself the cause of this unadulterated acclaim. The title of this novel is peculiar: Although John Williams’ Stoner was published in 1965, the title refers not to a hippie drug addict but to the staid and stern William Stoner, who for four decades teaches English at the University of Missouri.

    ... more ...

    Michael Geist: The LawBytes Podcast, Episode 8: LawBytes Lecture – What the Canadian Experience Teaches About the Future of Copyright Reform

    Earlier this spring, I delivered a keynote address at the Australian Digital Alliance’s 2019 Copyright Forum. The ADA is a leading voice on copyright issues in Australia and its annual Copyright Forum brings together government, creators, education, libraries, and the broader public to explore copyright issues. Coming off a holiday weekend with many celebrating Easter or Passover, this week’s Lawbytes podcast takes a different approach with a Lawbytes lecture, an audio recording of the ADA keynote, which used real data to dispel misleading claims about the impact of Canada’s 2012 copyright reforms.

    The podcast can be downloaded here and is embedded below.  Subscribe to the podcast via Apple Podcast, Google Play, Spotify or the RSS feed. Updates on the podcast on Twitter at @Lawbytespod.


    Episode Notes:

    YouTube video version of the ADA keynote address

    The post The LawBytes Podcast, Episode 8: LawBytes Lecture – What the Canadian Experience Teaches About the Future of Copyright Reform appeared first on Michael Geist.

    The Shape of Code: OSI licenses: number and survival

    There is a lot of source code available which is said to be open source. One definition of open source is software that has an associated open source license. Along with promoting open source, the Open Source Initiative (OSI) has a rigorous review process for open source licenses (so they say, I have no expertise in this area), and have become the major licensing brand in this area.

    Analyzing the use of licenses in source files and packages has become a niche research topic. The majority of source files don’t contain any license information, and, depending on language, many packages don’t include a license either (see Understanding the Usage, Impact, and Adoption of Non-OSI Approved Licenses). There is some evolution in license usage, i.e., changes of license terms.

    I knew that a fair-few open source licenses had been created, but how many, and how long have they been in use?

    I don’t know of any other work in this area, and the fastest way to get lots of information on open source licenses was to scrape the brand leader’s licensing page, using the Wayback Machine to obtain historical data. Starting in mid-2007, the OSI licensing page kept to a fixed format, making automatic extraction possible (via an awk script); there were few pages archived for 2000, 2001, and 2002, and no pages available for 2003, 2004, or 2005 (if you have any OSI license lists for these years, please send me a copy).

    What do I now know?

    Over the years OSI have listed 110107 different open source licenses, and currently lists 81. The actual number of license names listed, since 2000, is 205; the ‘extra’ licenses are the result of naming differences, such as the use of dashes, inclusion of a bracketed acronym (or not), license vs License, etc.

    Below is the Kaplan-Meier survival curve (with 95% confidence intervals) of licenses listed on the OSI licensing page (code+data):

    Survival curve of OSI licenses.
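
    The post links to the author’s own code and data; purely as an illustration of the technique, a Kaplan-Meier fit like this one could be reproduced along the following lines with the Python lifelines package (the column names and values here are hypothetical):

      import pandas as pd
      from lifelines import KaplanMeierFitter

      # One row per license: how many years it appeared on the OSI page,
      # and whether it was eventually removed (1) or is still listed (0, censored).
      licenses = pd.DataFrame({
          "years_listed": [12.0, 3.5, 19.0, 7.0],
          "removed":      [0,    1,   0,    1],
      })

      kmf = KaplanMeierFitter()
      kmf.fit(licenses["years_listed"], event_observed=licenses["removed"])
      ax = kmf.plot()                     # survival curve with its confidence band
      ax.set_xlabel("years listed on the OSI licensing page")
      ax.set_ylabel("survival probability")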

    How many license proposals have been submitted for review, but not been approved by OSI?

    Patrick Masson, from the OSI, kindly replied to my query on the number of license submissions. OSI doesn’t maintain a count, and what counts as a submission might be difficult to determine (OSI recently changed the review process to give a definitive rejection; they have also started providing a monthly review status). If any reader is keen, there is an archive of mailing list discussions on license submissions; trawling these would make a good thesis project :-)


    churchturing.org / 2019-05-09T23:02:59