you could've just copy + pasted https://cran.r-project.org/ in case any of us wanted to get a horribly malformed dataframe. The data I provided was to address what you were trying to refute, not absolutely everything you have ever said in your life. Jesus Christ. Does it say legal immigrants? Does it say immigrants of a certain status? No, it says immigrants. You're adding conditions now because apparently you can't proceed with immigration status, but you're going to randomly include randomass variables like family income in some grabass horrorshow of irrelevancy. I even gave you a method to derive immigration status that you refused to take. And can you point to when I said anything about Sergey Brin's immigration status? Specifically that he is an H-1B holder? I see you talking about the co-founder of Zenefits, and Yann LeCun, which is neither Google nor Sergey Brin last time I checked. Sergey would probably have been on a student visa class given his thesis work at Stanford. It would be out of character for me to claim a specific visa class for him, but surprise me if you can. All I see is you bringing up Sergey for no ******* reason.
I don't see anything wrong with it, other than that I could have stopped at tbl_df and written over x.file with the select() function. You can take it out, use as.tbl or some other wrapper, or you could opt to not use a wrapper and deal with the data frame itself. I like the way dplyr's wrappers don't print all columns when head() is called. The code I shared will download all the relevant CSV files within the link you gave; it will also "store" the links in the hrefs.csv vector. httr::GET() is another package/function you can use to download a file, if you so choose. Personally I like R: indentation does not change the nature of the code.
I, again, don't have any problem with the file format itself, and I understand why you'd want to treat your dataframe the way you do. Yes, when you have more data than needed, it's nice to call head() and see the first few values alone. I am trying to question the contents of the dataframe and how it is currently organized. For example: since you didn't do a UNION, how are you dealing with the unique IDs per year? Why no groupby() or whatever the f**k the R equivalent is? Why 12 columns (what extra value do some of these columns bring)? I also had a strong suspicion your parsing was off across the different CSVs you've read into memory, but that has lessened now that I look more carefully. Just so we're clear, the bolded is what I was disputing. I think now, though, that you're reading a compendium of all links into a CSV, then using that CSV as your filtered input vector for your tbl_df, which is just a strange thing for somebody used to storing values as dicts, parsing them on the spot with a dictionary comprehension, then grabbing files directly through requests and the csv module. All this to say: BOO R. It's a strange thing to me that you hate Python for indentation reasons -- then you love R for having the strangest syntax known to man??? I almost got a much more efficient structure, but you made me think in regex, and now I gotta run. Still, this below = all of the CSV vector bulls**t you have to do with R.
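For concreteness, the dict-comprehension pattern I keep talking about looks roughly like this. The HTML snippet and filenames here are made up for the sketch (not the real Kauffman page), and I'm using a bare regex instead of BeautifulSoup just to keep it self-contained:

```python
import re

# Stand-in for the page HTML; the real thing would come from requests.get(...).text
html = """
<a href="/files/kiea_2015.csv">2015</a>
<a href="/files/kiea_2016.csv">2016</a>
<a href="/about">about</a>
"""

# Parse on the spot with a dict comprehension: {label: href}, CSV links only.
link_pattern = re.compile(r'<a href="([^"]+\.csv)">([^<]+)</a>')
csv_links = {label: href for href, label in link_pattern.findall(html)}

print(csv_links)

# From here you'd grab each file directly, no intermediate CSV of links:
# for label, href in csv_links.items():
#     resp = requests.get("http://www.kauffman.org" + href)
#     ...write resp.content out, or hand it to the csv module...
```

No writing links out to disk and reading them back in; the dict *is* the filtered link store.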
```python
# libraries to import
import requests
from bs4 import BeautifulSoup
import re

# get all of the s**t
r = requests.get("http://www.kauffman.org/microsites/kauffman-index/about/archive/kauffman-index-of-entrepreneurial-activity-data-files")
soup = BeautifulSoup(r.text, 'html.parser')
print(soup.prettify())

# parse all of the s**t
links = soup.find_all(href=re.compile(r"\.csv$"))
for link in links:
    print(link.get('href'))
linklist = list(links)
print(linklist)
```

(I don't like sets; better list slicing notation.) Just ran that quickly in between another sprint item. If you run it in a Python notebook, you should be fine; I just need to fiddle with the regex to get it more precise (f**king regex.) and I'm halfway to what you've done.
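The other half of the loop, the part I haven't written yet, is resolving the scraped hrefs against the page URL and fetching them. A minimal sketch of that, assuming made-up relative hrefs for illustration; absolute_csv_links is my own helper name, not anything from requests or bs4:

```python
from urllib.parse import urljoin

BASE = "http://www.kauffman.org/microsites/kauffman-index/about/archive/kauffman-index-of-entrepreneurial-activity-data-files"

def absolute_csv_links(base, hrefs):
    """Resolve relative hrefs against the page URL and keep only .csv targets."""
    return [urljoin(base, h) for h in hrefs if h.lower().endswith(".csv")]

# Made-up hrefs standing in for what the link-scraping loop returns:
hrefs = ["/files/kiea_2015.csv", "../archive/kiea_2016.CSV", "/about"]
urls = absolute_csv_links(BASE, hrefs)
print(urls)

# Downloading is then just a loop (not run here):
# for url in urls:
#     resp = requests.get(url)
#     with open(url.rsplit("/", 1)[-1], "wb") as f:
#         f.write(resp.content)
```

urljoin handles both root-relative and ../-style hrefs, which is exactly the part a bare string concat gets wrong.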
Meanwhile, there is real crime connected to immigration... ICE lawyer in Seattle charged with stealing immigrant IDs https://www.seattletimes.com/seattl...seattle-charged-with-stealing-immigrants-ids/